Abstract of Dissertation Presented to the Graduate School
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 

NONSTATIONARITY,  NONLINEAR  DEPENDENCE,  AND  PREDICTION: 

AN  APPLICATION  TO  THE  TREASURY  BILL  FUTURES  MARKET 

By 

Jack  Praschnik 
August  1991 

Chairperson:  Professor  G.S.  Maddala 
Major  Department:  Economics 

This  study  describes  the  time  series  properties  of  U.S. 
Treasury  Bill  futures  prices  with  special  emphasis  on  unit 
root nonstationarity, nonlinear dependence, and prediction.
Although  most  research  of  financial  markets  assumes  that 
market  prices  follow  a specific  martingale  process,  namely  the 
random  walk,  recently  researchers  have  begun  to  question  this 
assumption.  This  assumption  implies  futures  prices  must 
contain  a unit  root,  yet  many  studies  are  inconclusive  or 
contradictory  on  this  point.  In  chapter  2 several  tests  for 
nonstationarity  are  applied  and  it  is  shown  that  futures 
prices  undoubtedly  contain  a unit  root. 

A more  formal  analysis  of  the  random  walk  hypothesis  is 
conducted  in  chapter  3 by  looking  at  both  linear  and  nonlinear 
dependence  of  first  differences  of  prices.  Nonparametric  and 
parametric  tests  of  linear  dependence  are  conducted  and  the 
results indicate that the data contain no significant linear
dependence. However, when tests for nonlinear dependence were
conducted, every test indicated nonlinear
dependence.

Based  on  the  results  from  chapter  3,  chapter  4 estimates 
nonlinear models and uses them for prediction. In this
chapter  much  is  learned.  First,  some  nonlinear  models  are 
excluded  simply  by  their  poor  estimation  performance.  Second, 
when  comparing  the  models'  predictive  performance  to  the 
random  walk,  it  becomes  clear  that  the  nonlinearities  of  the 
data  are  exploitable.  Two  of  four  models  are  able  to  perform 
better  than  the  random  walk  especially  in  shorter  horizons. 
Third,  the  best  nonlinear  model  is  chosen  after  comparing  the 
predictions  of  all  the  nonlinear  models  against  each  other. 
It  is  shown  that  the  bilinear  model  is  the  best  of  the 
nonlinear  models.  Finally,  it  is  shown  that  the  bilinear 
model outperforms the popular autoregressive conditional
heteroskedastic (ARCH) model.


CHAPTER  1 
INTRODUCTION 


General  Background 

The  martingale  process,  i.e.,  a stochastic  process  in 
which  the  expected  price  in  the  next  period  equals  the  current 
price,  has  an  established  record  in  characterizing  the  random 
nature  of  futures  prices.  Samuelson  (1965),  assuming  both 
perfect capital markets and an instantaneous adjustment
property,  was  the  first  to  formally  develop  a model  where 
futures  prices  are  characterized  by  a specific  martingale 
process  known  as  the  random  walk.  Since  then,  many  authors 
have  tested  the  random  walk  property  by  testing  first 
differences  for  serial  independence.^  The  results,  however, 
have  been  inconclusive.  Rocca  (1969)  and  Labys  and  Granger 
(1970)  both  concluded  that  the  martingale  process  provides  a 
good  description  of  futures  prices  even  though  minor 
departures may be encountered. However, using both time and
frequency domain tests, Cargill and Rausser (1972, 1975)
rejected the random walk hypothesis. Because previous time
and frequency domain tests both assume that futures prices are
normally distributed, Mann and Heifner (1976) used two
nonparametric tests to test the random walk hypothesis. They
also rejected the hypothesis after looking at prices for nine
commodities over a twelve year span.

^ Note that for a time series of the variable x to be a
martingale process the only requirement is that $E(x_{t+1}) = x_t$ and
$E(e_t) = 0$, where $e_t = x_{t+1} - x_t$. But for x to be a random walk process
the residual $e_t$ must also have the property that $\mathrm{Cov}(e_t, e_{t+k}) = 0$
for all $k \neq 0$.

In  addition  to  studies  of  the  residuals  of  price 
differences,  tests  for  nonstationarity  can  also  be  employed  to 
address  the  same  question.  Recall  that  a martingale  process 
is  a stochastic  process  in  which  the  expected  price  in  the 
next  period  equals  the  current  price.  Then  if  futures  prices 
can  be  described  by  this  type  of  process,  they  should  at  least 
contain  a unit  root  in  their  autoregressive  representation. 
Goldenberg  (1989)  finds  a unit  root  in  daily  S&P  500  futures 
prices.  In  addition,  Doukas  (1990)  found  that  futures  prices 
for  some  commodities,  namely  soybeans,  soy  meal,  and  soy  oil, 
contain  a unit  root.  These  papers  give  some  validity  to  the 
martingale  hypothesis,  but  by  themselves  cannot  be  conclusive. 

All  of  these  studies  of  futures  prices  above  have  tried 
to  investigate  the  martingale  hypothesis  or  more  specifically 
the  random  walk  hypothesis  by  testing  the  existence  of  linear 
dependence.  It  is  possible,  however,  that  the  random  walk 
hypothesis  may  be  violated  by  the  existence  of  nonlinear 
dependence.  Indirectly,  some  authors  have  addressed  this 
possibility  by  showing  that  profitable  trading  rules  may  exist 
even when changes in futures prices are serially uncorrelated.
Leuthold (1972), in investigating the futures market for
cattle,  used  filter  rules  to  show  that  profitable  trading 
rules  existed  for  the  period  1965-1970.  In  the  same  paper,  he 
showed  that  these  rules  may  exist  even  when  spectral  analysis 
indicates  that  price  changes  are  random.  Nonlinear  dependence 
has  been  found  in  other  financial  data,  but  direct  tests  for 
nonlinear  dependence  in  futures  prices  have  not  been 
conducted. 

Because  nonlinear  dependence  has  been  found  in  the 
residuals  of  price  changes  in  other  financial  markets,  it 
seems  useful  to  test  for  nonlinear  dependence  in  futures 
markets. There are several models that are good candidates
for modeling financial data, but one family of models, the
autoregressive,  conditional  heteroskedastic  (ARCH)  family,  has 
become  the  most  popular  univariate  time  series  model.  The 
cause  of  this  popularity  is  unclear.  Other  models  are  just  as 
easy  to  apply  and  have  an  intuitive  appeal  that  is  as  good  or 
better. 


Purpose  of  the  Study 

Because  the  random  walk  hypothesis  and  time  series 
properties  of  futures  prices  are  still  a subject  of  debate, 
the  present  dissertation  examines  the  statistical  nature  of 
futures  prices  in  detail.  As  the  title  of  the  dissertation 
suggests, nonstationarity, nonlinear dependence, and
prediction will be the focus of the analysis. Given the size
of the particular futures market chosen and the topics
selected  for  discussion,  the  analysis  will  be  sufficient  to 
shed  some  light  on  the  overall  behavior  of  futures  prices  in 
financial  markets. 

In  the  second  chapter,  the  property  of  nonstationarity  is 
examined  by  first  discussing  and  then  applying  tests  for 
nonstationarity  to  the  futures  contracts  chosen.  Four 
different  tests  are  used  and  some  are  used  with  different  lag 
structures  to  account  for  any  serial  correlation  found  in  the 
residuals  of  the  tests'  regression  equations.  Before 
concluding  that  the  data  are  stationary  or  nonstationary, 
however,  the  appropriate  Dickey-Fuller  model  of  the  data  is 
considered.  This  entails  analyzing  which  first  order 
autoregressive  representation,  i.e.,  with  no  constant,  just  a 
constant,  or  a constant  and  a trend  term,  is  the  one  that  fits 
the  data  best.  Test  results  indicate  that  first  differences 
of  prices  are  covariance  stationary  and  give  us  a necessary 
condition  to  further  investigate  the  random  walk  hypothesis. 

Given  the  unanimous  results  from  the  tests  for 
nonstationarity,  the  third  chapter  investigates  the  random 
walk  hypothesis  even  further  by  applying  parametric  and 
nonparametric  tests  for  linear  dependence,  a general  test  of 
dependence,  and  tests  for  nonlinear  dependence  to  first 
differences  of  prices.  As  opposed  to  previous  studies  of 
futures  prices,  which  used  indirect  tests  for  nonlinear 
dependence, direct tests for nonlinear dependence, which are
based on new time series techniques, are used in this chapter.
Among the tests used, I am the first to apply a very powerful test
known  as  the  Brock,  Dechert,  and  Scheinkman  (BDS)  test.  This 
BDS  test  is  based  on  the  correlation  integral  which  is  used  in 
physics  as  a measure  of  clustering.  Significant  linear 
dependence  is  rejected  by  all  of  the  tests,  but  nonlinear 
dependence  appears  in  every  data  set.  This  result  leaves  the 
random  walk  hypothesis  in  question  and  points  to  the  use  of 
nonlinear  models  as  the  most  appropriate  class  of  models  to 
describe  futures  price  data. 
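
As a rough illustration of the quantity on which the BDS test is built, the following Python sketch computes a sample correlation integral, i.e., the share of pairs of m-histories of a series that lie within a distance eps of each other under the maximum norm. The function name, the tiny input series, and the value of eps are assumptions made for illustration only; the variance normalization that turns this quantity into the BDS statistic is not shown.

```python
from itertools import combinations


def correlation_integral(x, m, eps):
    """Share of distinct pairs of m-histories of x within eps (max norm)."""
    n = len(x) - m + 1  # number of m-histories (x[i], ..., x[i+m-1])
    histories = [x[i:i + m] for i in range(n)]
    close = 0
    pairs = 0
    for a, b in combinations(histories, 2):
        pairs += 1
        if max(abs(u - v) for u, v in zip(a, b)) < eps:
            close += 1
    return close / pairs


# Tiny illustrative series (not data from this study).
x = [0.0, 1.0, 2.0, 3.0]
c1 = correlation_integral(x, m=1, eps=1.5)
c2 = correlation_integral(x, m=2, eps=1.5)
# Under independence, c2 should be close to c1 ** 2 in large samples.
print(c1, c2, c2 - c1 ** 2)
```

For an independent and identically distributed series, the correlation integral at dimension m is approximately the m-th power of the correlation integral at dimension 1, so a large scaled gap between the two points to dependence of some form.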

Modeling  the  data  is  taken  up  in  the  fourth  chapter. 
Several  nonlinear  models  are  applied  to  the  data,  namely,  an 
autoregressive,  conditional  heteroskedastic  in  mean  (ARCH-M) 
model (Engle, Lilien, and Robins, 1987), a generalized
autoregressive,  conditional  heteroskedastic  in  mean  (GARCH-M) 
model  (Bollerslev,  1986),  a bilinear  model  (Granger  and 
Andersen,  1978a),  a time-varying  parameter  model,  a time- 
series  segmentation  model  (Sclove,  1983),  and  the  stochastic, 
segmented trends model (Hamilton, 1989). First, the models
are  estimated.  Because  the  time-varying  parameter  and  time 
series  segmentation  models  do  not  fit  this  data,  they  are 
discarded.  The  remaining  models  are  estimated  and  used  for 
out-of-sample  prediction  by  reserving  the  last  50  days  of  data 
for each contract. Using two criteria, the mean square error
of prediction and Theil's U statistic, the models' predictions
are first compared to the prediction for a standard
martingale, the random walk, and then compared to one another.
The  bilinear  model  predicts  best  and  some  conclusions  are 
drawn  as  to  the  use  of  the  family  of  ARCH  models  when  modeling 
financial  futures  data. 
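
The two prediction criteria can be sketched in a few lines of Python. The version of Theil's U used here, the root mean square error of a model's forecasts divided by that of the no-change (random walk) forecast, is one common definition and is an assumption on my part, since the exact formula is given only in chapter 4. The prices and model forecasts are hypothetical numbers chosen for illustration.

```python
import math


def mse(actual, predicted):
    """Mean square error of prediction."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def theil_u(actual, predicted, last_observed):
    """RMSE of the model relative to the RMSE of the no-change forecast."""
    return math.sqrt(mse(actual, predicted) / mse(actual, last_observed))


# Hypothetical settlement prices and one-step-ahead model forecasts.
prices = [93.10, 93.15, 93.12, 93.20, 93.18]
actual = prices[1:]
last_observed = prices[:-1]          # random walk forecast of each day
model_forecast = [93.14, 93.13, 93.18, 93.19]

print(mse(actual, model_forecast))
print(theil_u(actual, model_forecast, last_observed))
```

Under this definition a value of U below one indicates that the model predicts better than the random walk over the hold-out period, and a value of one means the two perform equally.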

Data  Description 

The  data  are  daily  settlement  prices  for  90-day  U.S. 
Treasury  Bill  futures  contracts.  The  contracts  chosen  for 
analysis  in  this  dissertation  were  the  five  most  recent 
contracts  available  at  the  start  of  my  research.  These 
contracts matured in the third, sixth, ninth, and twelfth months
of 1988 and the third month of 1989 and hereafter are referred
to as contracts 88(3), 88(6), 88(9), 88(12), and 89(3),
respectively. After discarding the last month of trading for
each  contract  to  avoid  dependencies  caused  by  the  convergence 
of  futures  prices  to  spot  prices,  there  were  approximately  450 
observations  for  each  contract.^ 

A simple reason that this particular futures market is
chosen is that it is representative of all other financial
futures markets, especially futures markets of other short-
term credit instruments, by the dollar amount of transactions
and volume traded on the market on any given day. In
addition, it is the largest domestically traded futures
contract and is most often used as an indicator of future
interest rates.

^ The first month of trading under contract 88(9) was
characterized by dramatic upward and downward swings along
with very low volumes of trading. For this reason this
month of trading was also discarded to avoid any unexplainable
dependencies that this behavior may cause to appear.

To investigate the random walk hypothesis, first
differences are used in all of the tests, i.e., $e_t = P_t - P_{t-1}$,
where $P_t$ is the daily settlement price in time period t. To
get a feel for the data, table 1.1 provides the summary
statistics for these changes.


Table 1.1

SUMMARY STATISTICS FOR DAILY PRICE CHANGE
$e_t = P_t - P_{t-1}$

Contract     N     Mean      SD    Skewness  Kurtosis  Maximum  Minimum   T-stat
88(3)       476    .0027   .1107    1.8789    9.9306     1.06     -.36    .0001
88(6)       434   -.0010   .1121     .6724   20.7811     1.01     -.77    .0003
88(9)       409   -.0028   .1073    1.5254   19.8099      .99     -.55    .00002
88(12)      472   -.0007   .1038    1.3058   18.9414      .97     -.60   -.0002
89(3)       462   -.0028   .1033    1.3720   20.9839      .98     -.64   -.0002

N indicates the number of observations and the T-statistic is from an OLS regression of daily
price changes on time.


The skewness and kurtosis coefficients differ greatly
from those of a normal distribution (0 and 3,
respectively). In all of the samples, the density is skewed
to the right and the size of the kurtosis coefficients
indicates that the density is far more peaked around its
center than the density of a normal random variable
(leptokurtic). Note that if the density of price changes is
nonnormal, then satisfying simple tests for randomness will
only indicate that $e_t$ is uncorrelated over time. Without
normality, statistical independence cannot be inferred from
these results.
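
The moment-based entries of table 1.1 can be illustrated with a short Python sketch. The estimators below divide by the number of observations throughout, which is an assumption, since the exact estimators used for the table are not stated; the function name and the tiny price series are likewise illustrative only.

```python
import math


def summary_stats(prices):
    """Summary statistics of first differences e_t = P_t - P_{t-1}."""
    e = [b - a for a, b in zip(prices, prices[1:])]
    n = len(e)
    mean = sum(e) / n
    m2 = sum((v - mean) ** 2 for v in e) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in e) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in e) / n   # fourth central moment
    return {
        "N": n,
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,   # 0 for a normal density
        "kurtosis": m4 / m2 ** 2,     # 3 for a normal density
        "max": max(e),
        "min": min(e),
    }


# Tiny illustrative series (not data from this study).
stats = summary_stats([0.0, -1.0, 0.0, -1.0, 0.0])
print(stats)
```

A skewness coefficient above zero corresponds to a density skewed to the right, and a kurtosis coefficient well above three corresponds to the leptokurtic shape described above.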

In figures 1.1 through 1.5, the levels of futures prices
for  contracts  88(3)  through  89(3)  are  graphed  against  time  and 
in  figures  1.6  through  1.10  price  changes  are  graphed  against 
time  and  presented  in  the  same  order.  The  levels  of  the  data 
appear  to  be  autocorrelated  both  negatively  and  positively 
over  different  periods  of  time.  In  addition,  there  seem  to  be 
long  periods  where  futures  prices  move  in  one  direction. 
Hence, the statistical models proposed by Sclove (1983) and/or
Hamilton (1989), which will be discussed in chapter 4, seem
to be applicable.

The  changes  in  futures  prices,  on  the  surface,  are  less 
informative,  although  it  appears  that  the  data  are  bounded  and 
linearly  independent.  In  addition,  a simple  inspection  of  the 
way  the  amplitude  changes  over  time  may  lead  one  to  believe 
that  the  data  could  have  been  generated  by  some  linear 
martingale  process.  However,  it  is  also  known  that  graphs  of 
bilinear,  ARCH,  or  GARCH  processes,  processes  that  will  also 
be  discussed  in  chapter  4,  could  look  this  way.  Because  it  is 
difficult  to  visually  detect  whether  the  amplitude  of  the 
series  changes  over  time  or  is  related  over  time,  we  will 
leave  it  up  to  the  estimation  of  models  in  chapter  4 for  more 
information. The ability of a model to predict as well as
some standard residual diagnostic tests should distinguish the
most appropriate model for the data.

FIGURE 1.1 Futures Prices For Contract 88(3)
FIGURE 1.2 Futures Prices For Contract 88(6)
FIGURE 1.3 Futures Prices For Contract 88(9)
FIGURE 1.4 Futures Prices For Contract 88(12)
FIGURE 1.5 Futures Prices For Contract 89(3)
FIGURE 1.8 Changes In Futures Prices For Contract 88(9)
FIGURE 1.9 Changes In Futures Prices For Contract 88(12)

Layout of the Dissertation

As  the  title  of  this  thesis  suggests,  U.S.  Treasury 
Bill  futures  data  are  the  primary  focus  of  the  analysis. 
Given the data, the thesis has three objectives:
characterization,  estimation,  and  prediction.  First,  the  time 
series  properties  of  data  are  characterized.  By  this  it  is 
meant  that  the  properties  of  nonstationarity  and  nonlinearity 
are  investigated.  There  has  been  somewhat  of  an  ongoing 
curiosity  as  to  whether  futures  prices  contain  a unit  root.^ 
The  property  of  nonstationarity  is  addressed  and  several 
different  tests  for  unit  roots  are  applied  to  the  data  in  the 
second  chapter. 

In  the  third  chapter,  the  property  of  nonlinear 
dependence  is  examined.  In  a preliminary  analysis  of  some 
summary  statistics  of  the  data,  it  seems  likely  that  nonlinear 
dependence  is  an  intrinsic  part  of  the  data.  Since  this 
property  has  recently  been  found  in  other  financial  markets, 
namely,  the  foreign  exchange  rate  market  and  stock  market,  it 
raises  even  more  suspicion.  First,  the  data  are  checked  for 
any  dependence,  linear  or  nonlinear,  by  applying  some  general 
tests  of  dependence.  Then  the  data  are  purged  of  any  linear 
dependence  by  regressing  price  differences  on  ten  lags  using 
ordinary least squares estimation. This is done to avoid any
sensitivity that tests for nonlinearity may have to linear
dependence. The residuals from this procedure are then
examined for nonlinear dependence.

^ A full discussion of this curiosity is given in the
introduction to chapter 2.
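
The purging step can be sketched as an ordinary least squares regression of each price change on its own ten lags, keeping the residuals. The sketch below assumes that a constant is included in the regression and uses NumPy; the function name and the simulated input are illustrative and are not taken from the procedure as actually coded for chapter 3.

```python
import numpy as np


def ar_residuals(e, lags=10):
    """Residuals from an OLS regression of e_t on a constant and its lags."""
    e = np.asarray(e, dtype=float)
    y = e[lags:]
    # Column j holds e_{t-j} aligned with y_t = e_t.
    lagged = [e[lags - j:len(e) - j] for j in range(1, lags + 1)]
    X = np.column_stack([np.ones(len(y))] + lagged)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef


# Simulated price changes (not data from this study).
rng = np.random.default_rng(0)
e = rng.standard_normal(450)
resid = ar_residuals(e)
print(resid.shape)
```

By construction the residuals are uncorrelated in sample with each of the ten lags, so any dependence that later tests detect in them cannot be attributed to linear dependence at those lags.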

It  is  hypothesized  that  the  data  have  at  least  one  of  two 
types  of  nonlinear  dependence,  multiplicative  and/or  additive. 
Using  a new  test,  known  as  the  third  order  moment  test,  the 
nonlinear  dependence  found  in  the  data  is  then  classified  as 
one  of  these  types.  Whichever  type  of  dependence  is  found, 
this  information  can  then  be  used  to  identify  the  most 
appropriate  nonlinear  models. 

In  the  fourth  chapter,  the  second  objective,  estimation, 
is  addressed.  Here,  I estimate  several  univariate  time  series 
models,  evaluate  their  suitability,  and  consider  the  way 
nonlinearities  enter  the  data  and  their  implications  for  the 
importance  of  nonlinearities. 

Prediction using the nonlinear models is the last
objective and is also addressed in the fourth chapter. It
is here that two important questions are answered. First, can
the nonlinearities that exist in the data be exploited to earn
profits?* Securities traders as well as other researchers,
who have not been able to use nonlinear dependencies to assist
in predicting the mean of the process in other financial
markets, will find both this question and its answer
interesting. This question is answered by comparing
predictions from the nonlinear models to the prediction from
a simple linear model since it is believed that futures prices
behave like random walks. The second question is concerned
with the most appropriate nonlinear model. One particular
family of nonlinear models, called ARCH models, has been used
and sometimes abused by researchers when modeling financial
data. It is in this chapter that ARCH models are compared
with other nonlinear models of futures prices.

* Note that the ability to make profits will depend not
only on a successful model, but on the costs of trading
securities.

In the conclusion of this dissertation, several important
discoveries are highlighted. The results, taken as a whole,
should prove useful for researchers of futures markets and
will offer material for further research on the time series
properties of futures markets in general.


CHAPTER  2 
NONSTATIONARITY 

Introduction 

It has long been assumed that changes in futures prices
are covariance stationary processes (see, for example, Telser,
1967; Stevenson and Bear, 1970; Martell and Helms, 1978; and
Trevino and Martell, 1984). By this it is meant that,
although the probability distribution of the series may change
over time, the mean and variance of the series do not change
with time and the covariance between two realizations in time
depends only on the time difference, not on the time instant.
This assumption of covariance or wide-sense stationarity is
necessary for time-invariant representations of futures prices
in terms of their conditional expectations. In addition, for
any of the ergodic theorems to hold, stationarity is
necessary. Cargill and Rausser (1975), Stevenson and Bear
(1970), and Alexander (1961) report trends in commodity
futures prices, Goldenberg (1989) finds a unit root in daily
S&P 500 futures prices, and Doukas (1990) finds a unit root in
daily soy meal, soybean, and soy oil futures prices. The
conflicting discoveries on the issue of stationarity in
futures prices suggest that a formal test of the data is
required before any other time series analyses may be
conducted. In this chapter I set out to establish whether or
not futures prices are nonstationary.

For the simplest of unit root tests, under the null
hypothesis, the assumption is that the data follow a random
walk, i.e., $P_t = P_{t-1} + e_t$, where $P_t$ is the daily settlement
price of a futures contract in period t and $e_t$ is an
independently and identically distributed (i.i.d.) normal
random variable with mean 0 and variance $\sigma^2$. A conventional
and easily applied test for nonstationarity is the DF test,
suggested by Dickey and Fuller (1979). This test, however, is
somewhat limited since the error term is assumed to be
strictly i.i.d. $N(0, \sigma^2)$ under the null hypothesis. Recently,
a lot  of  effort  has  been  exerted  on  developing  tests  that 
relax  this  assumption.  The  Dickey-Fuller  test  for  unit  roots 
in  the  standard  AR(1)  model  can  be  generalized  to  test  for 
unit  roots  in  an  AR(p)  model.  The  Augmented  Dickey-Fuller 
(ADF)  test,  suggested  by  Said  and  Dickey  (1984),  extends  the 
Dickey-Fuller  test  to  account  for  serial  correlation  that  is 
typically  produced  by  autoregressive  moving  average  (ARMA) 
models. Two tests that nonparametrically adjust the DF test
to correct for infinite-dimensional nuisance parameters
associated with $e_t$ are the $Z_\alpha$ and $Z_t$ tests suggested by
Phillips (1987) and Phillips and Perron (1988) respectively.
The Phillips tests are designed to handle generalized forms
of serial correlation and/or heteroskedasticity that may be
contained in $e_t$.

In general, unit root tests have received a lot of
attention in contemporary research and have become major tools
in time series analyses. This chapter proceeds as follows.
In the next section, the details of the unit root tests
applied in this chapter are given. In the section entitled
"Testing the Data for Nonstationarity," I present the results
from applying these tests to the data. I identify the
appropriate Dickey-Fuller model to be used in the unit root
tests and the most suitable random walk model to be used in
subsequent chapters, and briefly conclude in the last section.

Tests  for  Nonstationarity 

A major  branch  of  the  literature  contains  tests  that  are 
all  based  on  the  following  observation.  Consider  the  simplest 
data  generation  process  that  allows  one  to  discuss  the  concept 
behind  these  tests; 

$P_t = \rho P_{t-1} + u_t$,   $u_t \sim$ i.i.d. $(0, \sigma^2)$   (2.1)

$P_0 = 0$.

If the null hypothesis is $H_0$: $\rho = \rho_0$, where $|\rho_0| < 1$, then the
t-statistic is asymptotically normally distributed. If $\rho_0 = 1$,
then the test statistic is no longer asymptotically
normal. The resulting distribution is not even symmetric.
Critical values for the hypothesis of a unit root are found by
using Monte Carlo simulation. They were first tabulated by
Dickey and presented in Fuller (1976). This simple test is
known as the Dickey-Fuller (DF) test. The critical values
that they tabulated, which are presented below, are based on
the following three models.

$\Delta P_t = (\rho - 1)P_{t-1} + u_t$   (2.2)

$\Delta P_t = \alpha + (\rho - 1)P_{t-1} + u_t$   (2.3)

$\Delta P_t = \alpha + \gamma t + (\rho - 1)P_{t-1} + u_t$   (2.4)

where $\Delta P_t = P_t - P_{t-1}$. If we let the sample size = T and
$\alpha$ = the cumulative probability (not the intercept of models 2.3
and 2.4), then the critical values, in
tables 2.1-2.3 below, correspond to the statistic
$(\hat{\rho} - 1)/SE(\hat{\rho})$ for models 2.2-2.4 respectively.


Table 2.1^

Empirical Cumulative Distribution of $(\hat{\rho} - 1)/SE(\hat{\rho})$ for Model 2.2

  T     α = 0.01    0.025     0.05     0.10     0.90     0.95    0.975     0.99
  25      -2.66     -2.26    -1.95    -1.60      .92     1.33     1.70     2.16
  50      -2.62     -2.25    -1.95    -1.61      .91     1.31     1.66     2.08
  100     -2.60     -2.24    -1.95    -1.61      .90     1.29     1.64     2.03
  250     -2.58     -2.23    -1.95    -1.62      .89     1.29     1.63     2.01
  500     -2.58     -2.23    -1.95    -1.62      .89     1.28     1.62     2.00
  ∞       -2.58     -2.23    -1.95    -1.62      .89     1.28     1.62     2.00

^ Tables 2.1-2.6 are found in Fuller (1976).

Table 2.2

Empirical Cumulative Distribution of $(\hat{\rho} - 1)/SE(\hat{\rho})$ for Model 2.3

  T     α = 0.01    0.025     0.05     0.10     0.90     0.95    0.975     0.99
  25      -3.75     -3.33    -3.00    -2.63    -0.37     0.00     0.34     0.72
  50      -3.58     -3.22    -2.93    -2.60    -0.40    -0.03     0.29     0.66
  100     -3.51     -3.17    -2.89    -2.58    -0.42    -0.05     0.26     0.63
  250     -3.46     -3.14    -2.88    -2.57    -0.42    -0.06     0.24     0.62
  500     -3.44     -3.13    -2.87    -2.57    -0.43    -0.07     0.24     0.61
  ∞       -3.43     -3.12    -2.86    -2.57    -0.44    -0.07     0.23     0.60

Table 2.3

Empirical Cumulative Distribution of $(\hat{\rho} - 1)/SE(\hat{\rho})$ for Model 2.4

  T     α = 0.01    0.025     0.05     0.10     0.90     0.95    0.975     0.99
  25      -4.38     -3.95    -3.60    -3.24    -1.14    -0.80    -0.50    -0.15
  50      -4.15     -3.80    -3.50    -3.18    -1.19    -0.87    -0.58    -0.24
  100     -4.04     -3.73    -3.45    -3.15    -1.22    -0.90    -0.62    -0.28
  250     -3.99     -3.69    -3.43    -3.13    -1.23    -0.92    -0.64    -0.31
  500     -3.98     -3.68    -3.42    -3.13    -1.24    -0.93    -0.65    -0.32
  ∞       -3.96     -3.66    -3.41    -3.12    -1.25    -0.94    -0.66    -0.33
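
As an illustration of how the statistic tabulated above is obtained, the following Python sketch runs the regressions in models 2.2-2.4 by ordinary least squares and returns the t-statistic on the lagged price. It uses NumPy; the function name, the `model` labels, and the simulated series are assumptions made for illustration and are not the programs used in this study.

```python
import numpy as np


def df_tstat(P, model="constant"):
    """t-statistic on P_{t-1} in the DF regression of the chosen model.

    model: "none" (2.2), "constant" (2.3), or "trend" (2.4).
    """
    if model not in ("none", "constant", "trend"):
        raise ValueError("model must be 'none', 'constant', or 'trend'")
    P = np.asarray(P, dtype=float)
    dP = np.diff(P)                 # dependent variable, P_t - P_{t-1}
    n = len(dP)
    cols = [P[:-1]]                 # lagged level, P_{t-1}
    if model in ("constant", "trend"):
        cols.append(np.ones(n))
    if model == "trend":
        cols.append(np.arange(1, n + 1, dtype=float))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, dP, rcond=None)
    resid = dP - X @ coef
    s2 = resid @ resid / (n - X.shape[1])      # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)          # OLS covariance matrix
    return coef[0] / np.sqrt(cov[0, 0])


# Simulated random walk (not data from this study).
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(450))
for name in ("none", "constant", "trend"):
    print(name, df_tstat(walk, name))
```

The resulting value is compared with the left-tail entries of the table that matches the model, for example -2.87 at the 5 percent level for model 2.3 with T = 500; a statistic below the critical value rejects the unit root.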

Concurrently, Dickey and Fuller presented an expanded version
of the DF test in Dickey and Fuller (1979). The initial test
discussed above handles the AR(1) case, whereas the expanded
version handles the AR(p) case. For $P_t$ as an AR(p) process

$P_t = \sum_{j=1}^{p} \rho_j P_{t-j} + e_t$   (2.5)

a test can be constructed by using the regression model

$\Delta P_t = (\rho - 1)P_{t-1} + \sum_{j} \gamma_j \Delta P_{t-j} + u_t$   (2.6)

where $\Delta P_t = P_t - P_{t-1}$. The test statistic, as in the DF test,
is the t-statistic for the coefficient of $P_{t-1}$. This model can
also be extended to include a constant and a trend just as the
models in 2.3 and 2.4. The test based on the model presented in
(2.6) is known as the Augmented Dickey-Fuller (ADF) test.

Said and Dickey (1984) show that the ADF test can also be
used to test for unit roots even when the error term $u_t$
follows an MA process or a general ARMA(p,q) process, so long
as the ARMA process is stationary and invertible. The only
proviso is that p rises with the sample size T so that there
exist numbers c > 0 and r > 0 such that $cp > T^{1/r}$ and $T^{-1/3}p \to 0$.
Theoretically, many choices of p can satisfy this requirement.
Schwert (1989) showed that tests for nonstationarity are
affected by the presence of a moving average or invertible
autoregressive parameter in the residuals of the Dickey-Fuller
models given in equations (2.2)-(2.4). However, depending on the
value of this parameter, different lengths of p are
appropriate. Hence, Schwert (1989) suggests that p be chosen
according to the following two equations, one which gives a
shorter length of p and the other a longer length.

$l_4 = \mathrm{INT}[4(T/100)^{1/4}]$   (2.7)

$l_{12} = \mathrm{INT}[12(T/100)^{1/4}]$   (2.8)

where INT[.] denotes the integer part. By using these
two values for l, two ADF statistics, ADF[4] and ADF[12],
corresponding to $l_4$ and $l_{12}$ respectively, are constructed.
The adjustment to the DF test that the ADF test makes is
simply one to retain the validity of the assumption of white
noise errors in the DF regression. The $Z_\alpha$ and $Z_t$ tests,
suggested by Phillips (1987) and Phillips and Perron (1988)
respectively, are tests that, instead of adjusting the DF
regression before estimation, modify the DF regression after
estimation through a nonparametric adjustment. Hence, the
error term is not assumed to follow a white noise process.
Like the ADF test, these tests handle possible autocorrelation
that may exist between the first differences of $P_t$. In
addition, these tests make allowances for heteroskedasticity
that the residuals of the DF regression may exhibit.
Formally, to find both the $Z_\alpha$ and $Z_t$ test statistics one starts
from the DF regression, i.e.,

$\Delta P_t = \alpha + \beta P_{t-1} + e_t$,   $\beta = \rho - 1$   (2.9)

The $Z_\alpha$ statistic is equal to $T\hat{\beta} - AD_\alpha$, where $\hat{\beta}$ is the OLS
estimate of $\beta$ in equation (2.9) and $AD_\alpha$ is defined as

$AD_\alpha = \dfrac{\tfrac{1}{2}(s_{Tl}^2 - s_e^2)}{T^{-2}\sum_{t=2}^{T}(P_{t-1} - \bar{P}_{-1})^2}$   (2.10)

where $\bar{P}_{-1}$ is the sample mean of $P_{t-1}$ and $s_e^2$ is the maximum
likelihood estimate of the sample variance of the residuals $e_t$. That is

$s_e^2 = T^{-1}\sum_{t=1}^{T} e_t^2$   (2.11)

and

$s_{Tl}^2 = T^{-1}\sum_{t=1}^{T} e_t^2 + 2T^{-1}\sum_{j=1}^{l}\omega_{jl}\sum_{t=j+1}^{T} e_t e_{t-j}$   (2.12)

Newey and West (1987) suggested the weights
$\omega_{jl} = 1 - j/(l + 1)$ to ensure that the estimate of the
variance $s_{Tl}^2$ is positive as well as consistent. The condition
on the lag structure is simply that $l \to \infty$ as $T \to \infty$, such that
$l$ is $o(T^{1/4})$. Note that Schwert's (1989) suggestion satisfies
this condition. Hence, I use, as Schwert did, $l_4$ and $l_{12}$ to
calculate the appropriate number of lags in $s_{Tl}^2$ and report two
$Z_\alpha$ statistics. To calculate the statistic, $e_t$ is replaced by
its OLS estimate $\hat{e}_t$. The $Z_\alpha$ test uses critical values that are
used for the alternative expression for the DF test statistic,
$T\hat{\beta}$. These critical values are given in tables 2.4-2.6 below.

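The $Z_t$ statistic is defined as

$Z_t = \frac{s_\varepsilon}{s_{Tl}}\,t_{\hat{\beta}} - \frac{\frac{1}{2}(s_{Tl}^2 - s_\varepsilon^2)}{s_{Tl}\left[T^{-2}\sum_{t=2}^{T}(P_{t-1} - \bar{P}_{-1})^2\right]^{1/2}}$  (2.13)

where $t_{\hat{\beta}}$ is the Student t-statistic of $\hat{\beta}$ from the simple DF
regression, $\bar{P}_{-1}$ denotes the sample mean of $P_{t-1}$, and $s_\varepsilon^2$ and
$s_{Tl}^2$ are defined as above. As for the $Z_\alpha$, there are two
statistics for the $Z_t$ statistic, one corresponding to each lag
structure, $l_4$ and $l_{12}$. The critical values for the $Z_t$
statistic are identical to those used for the DF and ADF tests
above.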
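
The construction of the two Z statistics can be sketched in a few lines. The example below is a minimal illustration, assuming the standard Phillips-Perron form of the correction terms, a DF regression with a constant, and Newey-West (Bartlett) weights; the function name `phillips_perron`, the returned dictionary keys, and the simulated series are hypothetical and are not taken from the study.

```python
import numpy as np


def phillips_perron(prices, lag):
    """Sketch of Z_alpha and Z_t for the DF regression dP_t = a + b*P_{t-1} + e_t."""
    p = np.asarray(prices, dtype=float)
    y = np.diff(p)                      # first differences dP_t
    x = p[:-1]                          # lagged level P_{t-1}
    T = y.size
    X = np.column_stack([np.ones(T), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ coef                    # OLS residuals
    beta = coef[1]

    s2_e = e @ e / T                    # residual variance (divides by T)
    s2_tl = s2_e                        # long-run variance, built up lag by lag
    for j in range(1, lag + 1):
        w = 1.0 - j / (lag + 1.0)       # Newey-West weight
        s2_tl += 2.0 * w * (e[j:] @ e[:-j]) / T

    ssx = np.sum((x - x.mean()) ** 2)   # sum of squared demeaned P_{t-1}
    t_beta = beta / np.sqrt((e @ e / (T - 2)) / ssx)   # usual OLS t-statistic

    gap = 0.5 * (s2_tl - s2_e)
    z_alpha = T * beta - gap / (ssx / T**2)
    z_t = np.sqrt(s2_e / s2_tl) * t_beta - gap / np.sqrt(s2_tl * ssx / T**2)
    return {"T": T, "beta": beta, "t_beta": t_beta,
            "z_alpha": z_alpha, "z_t": z_t}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    walk = np.cumsum(rng.standard_normal(500))   # simulated random walk, illustration only
    print(phillips_perron(walk, lag=5))
```

With `lag=0` the long-run variance equals the residual variance, so both corrections vanish and the statistics reduce to $T\hat{\beta}$ and the ordinary DF t-statistic. That makes the role of the nonparametric adjustment easy to see.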


Table 2.4

Empirical Cumulative Distribution of $T\hat{\beta}$ for Model 2.2

  T    α = 0.01    0.025     0.05     0.10     0.90     0.95    0.975     0.99

  25     -11.9      -9.3     -7.3     -5.3     1.01     1.40     1.79     2.28
  50     -12.9      -9.9     -7.7     -5.5     0.97     1.35     1.70     2.16
 100     -13.3     -10.2     -7.9     -5.6     0.95     1.31     1.65     2.09
 250     -13.6     -10.3     -8.0     -5.7     0.93     1.28     1.62     2.04
 500     -13.7     -10.4     -8.0     -5.7     0.93     1.28     1.61     2.04
   ∞     -13.8     -10.5     -8.1     -5.7     0.93     1.28     1.60     2.03

Table 2.5

Empirical Cumulative Distribution of $T\hat{\beta}$ for Model 2.3

  T    α = 0.01    0.025     0.05     0.10     0.90     0.95    0.975     0.99

  25     -17.9     -14.6    -12.5    -10.2    -0.76     0.01     0.69     1.40
  50     -18.9     -15.7    -13.3    -10.7    -0.81    -0.07     0.53     1.22
 100     -19.8     -16.3    -13.7    -11.0    -0.83    -0.10     0.47     1.14
 250     -20.3     -16.6    -14.0    -11.2    -0.84    -0.12     0.43     1.09
 500     -20.5     -16.8    -14.0    -11.2    -0.84    -0.13     0.42     1.06
   ∞     -20.7     -16.9    -14.1    -11.3    -0.85    -0.13     0.41     1.04

Table 2.6

Empirical Cumulative Distribution of $T\hat{\beta}$ for Model 2.4

  T    α = 0.01    0.025     0.05     0.10     0.90     0.95    0.975     0.99

  25     -22.9     -19.3    -17.9    -15.6    -3.66    -2.51    -1.53    -0.43
  50     -25.7     -22.4    -19.8    -16.8    -3.71    -2.60    -1.66    -0.65
 100     -27.4     -23.6    -20.7    -17.5    -3.74    -2.62    -1.73    -0.75
 250     -28.4     -24.4    -21.3    -18.0    -3.75    -2.64    -1.78    -0.82
 500     -28.9     -24.8    -21.5    -18.1    -3.76    -2.65    -1.78    -0.84
   ∞     -29.5     -25.1    -21.8    -18.3    -3.77    -2.66    -1.79    -0.87

Both $Z_\alpha$ and $Z_t$ test statistics, like the DF and ADF tests,
can be based upon the three different models, 2.2-2.4, above.
However, model 2.4 must be modified when used for formulating
the Z statistics. We modify 2.4 as follows:

$\Delta P_t = \alpha + \gamma(t - T/2) + (\rho - 1)P_{t-1} + \varepsilon_t$.  (2.4*)


Testing  the  Data  for  Nonstationarity 
In  this  section  I apply  the  tests  described  above  to  the 
Treasury  Bill  futures  data.  The  three  regression  models 
listed in 2.2-2.4, using 2.4* where appropriate, are employed
under each test. In addition, the lag structures $l_4$ and $l_{12}$
are used for the ADF, $Z_\alpha$, and $Z_t$ tests. In total, twenty-one
statistics will be presented, seven for each model 2.2-2.4.
Tables 2.7-2.9 correspond to the models 2.2-2.4 respectively.


Table 2.7

Unit Root Tests with Zero Mean and Trend under the $H_0$
(Model 2.2)

Tests          88(3)      88(6)      88(9)      88(12)     89(3)

DF             0.538     -0.202     -0.492     -0.535     -0.591
ADF(4)         0.335      0.087     -0.689     -0.604     -0.730
ADF(12)        0.276      0.223     -0.485     -0.668     -0.458
Zα(4)         -1.278     -0.155      1.604      0.883      1.916
Zα(12)         0.065      0.019     -0.026     -0.011     -0.037
Zt(4)          0.160     -0.238     -0.153     -0.361     -0.247
Zt(12)        -0.139     -0.467     -0.355     -0.554     -0.299

88(3)-89(3) denotes the five contracts. All test statistics in the table, except those
for the Zα test, should be compared to the critical values found in table 2.1. The
statistics for the Zα test should be compared to the critical values found in table
2.4. * and ** denote a rejection of the $H_0$ for a one-sided test at the 5% and 1% levels
of significance respectively. Note that large positive statistics indicate a rejection
of a unit root but not of nonstationarity.

Table 2.8

Unit Root Tests with Nonzero Mean under the $H_0$
(Model 2.3)

Tests          88(3)      88(6)      88(9)      88(12)     89(3)

DF            -1.744     -2.024     -1.948     -1.925     -1.647
ADF(4)        -1.913     -1.686     -1.766     -1.783     -1.181
ADF(12)       -2.059     -2.149     -2.214     -2.398     -1.518
Zα(4)         -8.486     -7.906     -6.177     -6.855     -5.763
Zα(12)        -6.846     -7.412     -7.468     -8.370     -7.238
Zt(4)         -1.813     -2.016     -1.993     -1.931     -1.582
Zt(12)        -1.937     -2.044     -1.947     -1.938     -1.633

See notes under table 2.7. The critical values are found in tables 2.2 and 2.5.


Table 2.9

Unit Root Tests with Nonzero Mean and Trend under the $H_0$
(Model 2.4)

Tests          88(3)      88(6)      88(9)      88(12)     89(3)

DF            -1.745     -2.027     -1.945     -1.977     -1.679
ADF(4)        -1.911     -1.684     -1.756     -1.816     -1.194
ADF(12)       -2.044     -2.146     -2.214     -2.393     -1.486
Zα(4)         -8.497     -7.893     -6.190     -7.268     -5.863
Zα(12)        -6.874     -7.425     -7.479     -7.848     -7.389
Zt(4)         -1.816     -2.019     -1.991     -1.987     -1.618
Zt(12)        -1.932     -2.042     -1.945     -1.987     -1.658

See notes under table 2.7. The critical values are found in tables 2.3 and 2.6.


The test statistics applied to every model unanimously
indicate that the data are nonstationary and that the unit root
hypothesis cannot be rejected. To be sure, however, I conduct
the same unit root tests on the first differences of prices.
These results are given in tables 2.10-2.12. Regardless of
the data set, the test used, or the model specified, at the 5%
level of significance the tests reject the null hypothesis
that first differences of futures prices contain a unit root.
Hence, one can be more certain that the levels of futures
prices are integrated of order 1, that is, that they contain a
unit root.


Table 2.10

Unit Root Tests with Zero Mean and Trend under the $H_0$
Using First Differences of Prices (Model 2.2)

Tests          88(3)      88(6)      88(9)      88(12)     89(3)

DF            -21.44     -21.91     -22.79     -22.44     -21.56
ADF(4)         -8.56      -8.46      -8.38      -8.62      -8.74
ADF(12)        -4.11      -4.25      -3.96      -3.81      -4.21
Zα(4)         -499.7     -435.0     -448.9     -472.2     -420.3
Zα(12)        -536.9     -469.8     -490.8     -511.3     -426.5
Zt(4)         -20.27     -21.66     -23.37     -23.25     -24.18
Zt(12)        -19.44     -20.11     -21.19     -21.49     -23.83

88(3)-89(3) denotes the five contracts. All test statistics in the table, except those
for the Zα test, should be compared to the critical values found in table 2.1. The
statistics for the Zα test should be compared to the critical values found in
table 2.4.


Table 2.11

Unit Root Tests with Nonzero Mean under the $H_0$
Using First Differences of Prices (Model 2.3)

Tests          88(3)      88(6)      88(9)      88(12)     89(3)

DF            -21.43     -21.88     -22.78     -22.43     -21.54
ADF(4)         -8.56      -8.45      -8.40      -8.64      -8.76
ADF(12)        -4.09      -4.26      -3.98      -3.86      -4.20
Zα(4)         -499.7     -434.9     -448.5     -472.0     -420.0
Zα(12)        -536.1     -469.6     -488.4     -509.5     -425.1
Zt(4)         -20.27     -21.63     -23.40     -23.26     -24.22
Zt(12)        -19.45     -20.10     -21.29     -21.56     -23.95

The critical values for this table are found in tables 2.2 and 2.5.

Table 2.12

Unit Root Tests with Nonzero Mean and Trend under the $H_0$
Using First Differences of Prices (Model 2.4)

Tests          88(3)      88(6)      88(9)      88(12)     89(3)

DF            -21.41     -21.85     -22.75     -22.40     -21.56
ADF(4)         -8.58      -8.45      -8.39      -8.63      -8.80
ADF(12)        -4.04      -4.22      -3.97      -3.85      -4.21
Zα(4)         -499.5     -435.1     -448.5     -472.0     -419.6
Zα(12)        -534.5     -468.9     -488.2     -509.6     -420.9
Zt(4)         -20.27     -21.61     -23.37     -23.24     -24.33
Zt(12)        -19.47     -20.11     -21.27     -21.53     -24.38

The critical values for this table are found in tables 2.3 and 2.6.


In the next section we choose the most appropriate random walk
model.


Choosing the Most Appropriate Random Walk Model
Because the random walk will be used in the analyses
conducted in the next chapters, an ordinary least squares
(OLS) regression is conducted on all five data sets to suggest
which of models 2.2-2.4 is the most appropriate. The results
from running the regression

$P_t = \alpha + \rho P_{t-1} + \gamma t + \varepsilon_t$  (2.14)

are given in table 2.13. The t-statistic given for $\hat{\rho}$ is the
statistic $(\hat{\rho} - 1)/\hat{\sigma}_{\hat{\rho}}$.

Table 2.13

Choosing the Appropriate Random Walk Model I

Data          88(3)      88(6)      88(9)      88(12)     89(3)

$\hat{\alpha}$            1.36       5.14       5.88       1.55       1.48
             (1.74)     (3.53)     (3.64)     (1.97)     (1.68)

$\hat{\rho}$            0.99       0.95       0.94       0.98       0.98
            (-1.19)    (-3.19)    (-3.45)    (-2.39)    (-2.11)

$\hat{\gamma}$          .00002     .00003    -.00001    -.00002    -.00003
             (0.42)     (0.42)    (-0.13)    (-0.46)    (-0.81)

T-statistics are given in parentheses.


In none of the data sets do the results suggest using a random
walk with a trend term. Under the premise that futures prices
contain a unit root and because the results of table 2.13
indicate in four of the five data sets that $\hat{\rho}$ is significantly
different from 1 at conventional levels of significance, I
discarded the trend term and the constant term when they were
insignificant and estimated the models again by using OLS.
These results are given in table 2.14.


Table 2.14

Choosing the Appropriate Random Walk Model II

Data          88(3)      88(6)      88(9)      88(12)     89(3)

$\hat{\alpha}$                       1.60       1.70
                        (2.02)     (1.94)

$\hat{\rho}$            1.00       0.98       0.98       1.00       1.00
             (0.00)    (-2.02)    (-1.95)    (-0.58)    (-0.58)

T-statistics are given in parentheses. I also estimated the simple random walk for data sets
88(6) and 88(9) and in neither case was the estimated $\rho$ significantly
different from one.

From table 2.14 one can clearly see that the best random walk
model is one with neither a drift nor a trend term. Because
these results are also consistent with another study conducted
on financial futures prices (see Goldenberg, 1989), this model
is the one that will be used in later analyses.


CHAPTER  3 

NONLINEAR  DEPENDENCE 
Introduction 

In  the  last  twenty-five  years,  solutions  to  equilibrium 
asset  pricing  models  have  suggested  two  very  different  asset 
pricing  functions.  Many  researchers,  beginning  with  Samuelson 
(1965), have used these models to propose that asset prices
behave  as  linear  martingale  processes.  The  linear  martingale 
process  arises  out  of  the  assumptions  of  perfect  capital 
markets  and  an  instantaneous  adjustment  process  or  other  more 
specialized  assumptions  such  as  the  serial  independence  of 
dividend  growth  rates  along  with  constant  relative  risk 
aversion (see Ohlson, 1977). Other researchers, Lucas (1978)
and  Breeden  (1979),  have  shown  that  general  equilibrium  asset 
pricing  models  are  more  likely  to  be  consistent  with  pricing 
functions  that  are  stochastic  and  nonlinear  if  agents  are  risk 
averse.  Why  should  nonlinear  dependence  or  departures  from 
linear  martingales  be  so  surprising  then,  when  the  assumptions 
under  which  linear  martingale  processes  hold  are, 
comparatively, so restrictive? Actually they are not
surprising. What is surprising is that theoretical extensions,
which incorporate the assumption of risk aversion, have not been
able to account for the departures from the martingale process
that one sees empirically. Hence, tests and explanations of
market performance, based on both linear and nonlinear asset
pricing functions, are inconclusive.

In  the  recent  past  there  have  been  many  statistical 
contributions  made  to  the  nonlinear  time  series  literature. 
With  these  new  contributions,  to  name  a few,  the  ARCH 
specification  test  (Engle,  1982),  the  ARCH-in-mean 
specification  test  (Engle,  Lilien,  and  Robins,  1987),  Tsay's 
test  for  nonlinear  dependence  (1986),  and  the  BDS  test 
proposed  by  Brock,  Dechert,  and  Scheinkman  (1987),  researchers 
in  both  economics  and  finance  have  turned  again  to 
investigating  the  statistical  properties  found  in  economic  and 
financial  data.  In  studies  of  financial  data,  Scheinkman  and 
LeBaron  (1989)  and  Akgiray  (1989)  found  stock  returns  to  be 
nonlinearly  dependent.  Hsieh  (1989),  and  Papell  and  Sayers 
(1989)  found  that  changes  in  foreign  exchange  rates  are 
nonlinearly  dependent.  In  addition,  both  of  the  latter  papers 
using  changes  in  exchange  rates  and  the  paper  by  Akgiray 
(1989)  using  stock  returns  found  that  these  data  are 
characterized  by  effects  that  may  be  successfully  modelled  by 
ARCH  or  Generalized  ARCH  models. 

The  purpose  of  this  chapter  is  to  test  changes  in  90-day 
U.S.  Treasury  Bill  futures  prices  for  nonlinear  dependence. 
In chapter 2, I established that futures prices are
nonstationary in levels, that is, that they contain a unit
root, while changes in futures prices are stationary. What has
not been settled is the process by which these changes evolve.
If changes are independent, then the random walk model can be
justified; if not, then nonlinear dependence must be considered
when modelling the data.

In  the  next  section,  I investigate  whether  or  not  changes 
in  the  contract's  price  are  independent.  Several  tests,  both 
parametric  and  nonparametric,  are  used  for  this  purpose. 
Next,  in  the  section  entitled  "A  More  Powerful  Test  of 
Dependence," a brief account of the Brock, Dechert, and
Scheinkman (1987), hereafter BDS, statistic is given and then
used  to  detect  any  dependence  that  the  data  may  contain.  In 
the  section  following  this  one,  three  tests  for  nonlinearity 
are  discussed  and  then  conducted,  namely,  Tsay's  (1986)  test, 
a specification  test  for  Engle's  (1982)  autoregressive 
conditional  heteroskedasticity  (ARCH)  model,  and  the  BDS  test 
applied  to  filtered  data.  These  tests  are  jointly  used  since 
Tsay's  test  has  good  power  against  nonlinear  moving  average 
processes  and  bilinear  models,  ARCH  processes  should  be 
detected  by  the  ARCH  specification  test,  and  if  other  types  of 
nonlinear  dependence  exist,  then  the  BDS  test,  being  a more 
general  test,  should  recognize  them.  In  the  penultimate 
section  I explain  and  differentiate  between  the  types  of 
nonlinearity  that  may  describe  the  data.  For  this  purpose  I 
use Hsieh's (1989) test. The summary completes the chapter.

Testing for Independence

The  methods  used  are  both  parametric  and  non-parametric 
tests  based  on  the  time  domain.  Namely,  the  analysis  involves 
the  parametric  Box-Pierce  Q test  and  the  non-parametric 
difference-sign  test,  turning  point  test,  and  a test  based  on 
a mixed  statistic.  Tests  involving  spectral  analysis  or 
filters  present  complexities  and  therefore  are  avoided  (see 
Praetz,  1976). 

Q refers to the Box-Pierce statistic and is used to test
the assumption of white noise disturbances. The alternative
hypothesis under this test, namely non-white noise
disturbances, might seem rather vague; however, the Q-test can
be derived as the Lagrange Multiplier test against an AR(p) or
an MA(p) process. It is distributed $\chi^2(k)$ where k is the lag.
K, the statistic of the difference-sign test, is the number of
+ signs in the sequence $e_t$ and is asymptotically distributed
normally with mean (N-1)/2 and variance (N+1)/12 where N is
the length of the sequence. If a sequence is independent with
mean zero, then the number of positive values should not be
significantly different from the number of negative ones.
Hence, the K-statistic measures this departure from
independence. The turning point test statistic, r, refers to
the total number of runs up or down. As opposed to the
K-statistic, this statistic measures departures from
independence by considering the sequence of positive and
negative values. It can be the case that there are as many
positive as negative values, but there are only two runs. r
is also asymptotically distributed normally with mean (2N-1)/3
and variance (16N-29)/90. The mixed test, sometimes known as
a u-run statistic, is formed by combining the statistics of
the difference-sign and turning point tests. The statistic
is constructed as follows:

$T = Z_K^2 + Z_r^2$  (3.1)

where the Z's are standardized forms of K and r respectively.
The limiting distribution of $T$ is $\chi^2(2)$ since K and r are
asymptotically independent. The intention of the u-run
statistic is to make the test less sensitive to specific
patterns and more sensitive to general departures from
independence.
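
The three nonparametric statistics can be sketched directly from the moments quoted above. This is a minimal illustration, not the computation used in the study: the function names are hypothetical, each function differences its input internally (the textbook form of the difference-sign test), and ties are simply dropped when counting runs, which is a simplifying assumption.

```python
import numpy as np


def difference_sign_z(x):
    """Standardized count K of positive first differences of x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.sum(np.diff(x) > 0)
    return (k - (n - 1) / 2.0) / np.sqrt((n + 1) / 12.0)


def runs_up_down_z(x):
    """Standardized total number r of runs up or down in x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    signs = np.sign(np.diff(x))
    signs = signs[signs != 0]                      # assumption: ignore ties
    r = 1 + np.sum(signs[1:] != signs[:-1])        # a new run starts at each sign change
    return (r - (2 * n - 1) / 3.0) / np.sqrt((16 * n - 29) / 90.0)


def mixed_statistic(x):
    """Sum of the two squared standardized statistics, compared with chi-square(2)."""
    return difference_sign_z(x) ** 2 + runs_up_down_z(x) ** 2


if __name__ == "__main__":
    zigzag = [1, 3, 2, 4, 3, 5, 4, 6]              # alternates up and down: many runs
    print(difference_sign_z(zigzag), runs_up_down_z(zigzag), mixed_statistic(zigzag))
```

The zigzag series shows why the two components are combined: its count of increases is close to the expected value, while the number of runs is well above it, so only the runs component signals the pattern.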

The results from applying these tests to the series $e_t$
are presented in Table 3.1.


Table 3.1

Tests For Independence

Contracts      88(3)      88(6)      88(9)      88(12)     89(3)

Q (lag 6)      11.77       3.87      10.22      13.34*     14.92**
Q (lag 12)     14.58      10.50      16.05      17.40      21.49*
Q (lag 18)     17.51      11.80      18.68      20.47      23.77
Q (lag 24)     20.10      17.02      24.49      26.36      27.65
$Z_K$            -.55       -.08       -.34       -.40        .32
$Z_r$           -2.39**     -.46       -.86      -1.24      -1.65
$T$              6.01*       .22        .86       1.70       2.86

$Z_K$ and $Z_r$ refer to the standardized forms of K and r respectively. The null hypothesis under all
tests is that successive price differences are random. * and ** denote a rejection of the $H_0$ at
the 5% and 2.5% levels of significance respectively.


By  looking  at  table  3.1,  it  is  reasonable  to  accept  that  price 
changes  for  any  of  the  five  contracts  are  independent.  This 
is  not  surprising  for  several  reasons.  For  one,  these  tests 
are  not  very  powerful.  Secondly,  unless  the  distribution  of 
price  changes  is  normal,  these  tests  can  only  verify  that 
these  changes  are  uncorrelated,  not  statistically  independent. 
Thirdly,  time  series  generated  by  nonlinear  moving  average 
models,  threshold  autoregressive  models,  bilinear  time-series 
models,  or  ARCH  models  exhibit  little  or  no  serial  correlation 
even  though  the  time  series  may  be  statistically  dependent 
across  time.  Because  of  these  reasons  and  the  possibility 
that  nonlinear  asset  pricing  equations  may  provide  a better 
description  of  the  evolution  of  some  asset  prices,  these  time 
series  are  tested  for  nonlinearities  in  the  third  section. 

A More Powerful Test of Dependence: The BDS Test

The  BDS  test,  suggested  in  a paper  by  Brock,  Dechert,  and 
Scheinkman  (1987),  also  is  designed  to  detect  departures  from 
independence.  This  test,  however,  as  compared  to  the  tests 
discussed  above,  is  more  powerful.  It  has  the  power  to 
recognize  dependencies  in  underlying  processes  that  are 
nonlinear  as  well  as  linear.  As  compared  to  other  well  known 
tests  of  nonlinear  dependence,  such  as  the  ARCH  specification 
test,  the  BDS  test  is  more  general.  It  is  able  to  discern 
nonlinearities  that  are  often  not  found  when  other  tests, 
which  target  specific  types  of  nonlinearity,  are  used. 
Formally, the BDS statistic is constructed by using the
correlation integral. The correlation integral is defined as

$C_d(\delta, T) = \frac{2}{N(N-1)} \sum_{t<s} I_\delta(e_t^d, e_s^d)$  (3.2)

where $I_\delta(x,y) = 1$ if $\|x - y\| < \delta$ and 0 otherwise,
$\|x - y\| = \max_j |x_j - y_j|$, $\delta$ is the tolerance distance chosen by
the researcher, d is the embedding dimension, N = T - d + 1,
and T is the length of the time series. It is used to
calculate the proportion of pairs of d-histories whose distance
from one another is less than the chosen value $\delta$. Consider
the series of successive price differences $e_t$. If this series
has length T, then it is possible to create N = T-(d-1)
subseries of length d. If we denote these subseries, or
d-histories, by $\{e_t^d\}$, where $e_t^d = (e_t, e_{t+1}, \ldots, e_{t+d-1})$, then
the correlation integral can be used as a measure of
clustering. If the subseries cluster in any dimension, then
the correlation integral will take on relatively larger values.
From this premise, BDS (1987) formed their statistic. If under
the null hypothesis $e_t$ is an independently and identically
distributed random variable, BDS (1987) showed that the quantity
$D_d = C_d(\delta) - [C_1(\delta)]^d$ should approach 0 as $T \to \infty$. They also
show that under the same null, the statistic

$BDS(d,\delta) = T^{1/2} D_d / b_d$  (3.3)

where $b_d$ is the consistent estimate of the standard deviation
of the statistic $T^{1/2} D_d$ (for an exact value for this standard
deviation see Hsieh, 1989), converges to a N(0,1) variable as
$T \to \infty$. For large values of the $BDS(d,\delta)$ statistic, the null
hypothesis, as stated above, is rejected.
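
To make the role of the correlation integral concrete, the sketch below computes it with the maximum norm and forms the gap between the d-dimensional integral and the d-th power of the one-dimensional integral. It stops short of the full BDS statistic, because the standard deviation estimate in the denominator is not reproduced in this chapter. The function names and the toy inputs are hypothetical.

```python
import numpy as np


def correlation_integral(e, d, delta):
    """Share of pairs of d-histories lying within delta of each other (max norm)."""
    e = np.asarray(e, dtype=float)
    n = e.size - d + 1                                   # number of d-histories
    hist = np.column_stack([e[i:i + n] for i in range(d)])
    # pairwise max-norm distances between all d-histories
    dist = np.max(np.abs(hist[:, None, :] - hist[None, :, :]), axis=2)
    upper = np.triu_indices(n, k=1)                      # each pair t < s counted once
    return np.mean(dist[upper] < delta)


def clustering_gap(e, d, delta):
    """C_d(delta) - C_1(delta)^d, which should be near zero for i.i.d. data."""
    return correlation_integral(e, d, delta) - correlation_integral(e, 1, delta) ** d


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(400)                     # simulated i.i.d. series
    print(clustering_gap(noise, d=2, delta=1.5 * noise.std()))
```

The pairwise distance array grows with the square of the series length, so this direct approach suits series of a few hundred to a few thousand observations; longer series would call for a blocked computation.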

The  finite  sample  properties  of  the  BDS  statistic  are 
discussed  in  Hsieh  and  LeBaron  (1988).  To  obtain  the  size  of 
the  statistic  under  the  null  hypothesis,  they  generated 
pseudo-random  numbers  for  the  following  distributions: 

1) Standard Normal, 2) Student-t with 3 degrees of freedom,
divided by $\sqrt{3}$, 3) Double exponential distribution, divided by
$\sqrt{2}$, 4) Chi-Square with 4 degrees of freedom, divided by $\sqrt{8}$, 5)
Uniform on $(0, 2\sqrt{3})$, 6) Bimodal mixture of normals: .5 N(3,1)
+ .5 N(-3,1), divided by $\sqrt{10}$. For the sample size closest to
the sample sizes considered in this paper, T=500, they
considered embedding dimensions, d, of 2 through 6 for each of
these distributions. Using their results, the value of $\delta$ is
kept between 1 and 2 times the standard deviation of the data.
It is only between these values of $\delta$ that the statistic's
finite sample distribution, under all six distributions,
remains reasonably close to its limiting distribution.

Table 3.2

The BDS Test Statistics for Daily Price Changes
in Contract 88(3)

d     δ=1xSD    δ=1.25xSD    δ=1.5xSD    δ=1.75xSD    δ=2xSD

2     3.9889     4.2956       4.2732      4.2292      4.1517
3     4.6422     4.9603       5.0371      5.1754      5.5658
4     5.3913     5.5302       5.3738      5.4578      5.9125
5     6.1221     6.1966       5.9142      5.9678      6.5182

SD denotes the standard deviation of the sample containing the data for contract 88(3). The
standard deviation for this series is 0.1077. Each table below, 3.3-3.6, will use its
corresponding standard deviation. So SD in table 3.3 corresponds to the standard deviation
of the sample containing the data for contract 88(6).

Table 3.3

The BDS Test Statistics for Daily Price Changes
in Contract 88(6)

d     δ=1xSD    δ=1.25xSD    δ=1.5xSD    δ=1.75xSD    δ=2xSD

2     5.2652     5.4226       6.3720      8.3590      9.1464
3     5.5696     5.5757       6.0939      7.7403      8.2232
4     6.0014     5.8263       6.3881      8.1974      8.9037
5     6.4692     6.2700       6.7993      8.5717      9.1674

See note above. The standard deviation for this series is 0.1121.

Table 3.4

The BDS Test Statistics for Daily Price Changes
in Contract 88(9)

d     δ=1xSD    δ=1.25xSD    δ=1.5xSD    δ=1.75xSD    δ=2xSD

2     3.0903     3.4694       3.1709      3.2467      3.9871
3     3.2882     3.7190       3.4700      3.3861      3.7738
4     3.9425     4.3936       4.1401      4.1259      4.8387
5     4.3941     4.9232       4.7071      4.6220      5.3392

See note above. The standard deviation for this series is 0.1071.

Table 3.5

The BDS Test Statistics for Daily Price Changes
in Contract 88(12)

d     δ=1xSD    δ=1.25xSD    δ=1.5xSD    δ=1.75xSD    δ=2xSD

2     4.0408     4.7484       5.2530      6.7436      8.8590
3     4.0633     4.8537       5.4303      6.2574      7.5480
4     4.4454     5.2891       5.8382      6.6060      8.0152
5     5.2774     6.0475       6.4174      7.1947      8.7382

See note above. The standard deviation for this series is 0.1086.


Table 3.6

The BDS Test Statistics for Daily Price Changes
in Contract 89(3)

d     δ=1xSD    δ=1.25xSD    δ=1.5xSD    δ=1.75xSD    δ=2xSD

2     1.9596     2.8260       3.7682      4.5710      4.9149
3     2.7427     3.5656       4.3150      4.7623      4.7898
4     3.4330     4.3962       5.2114      5.8230      6.2462
5     4.0287     4.9942       5.7441      6.2903      6.7551

See note above. The standard deviation for this series is 0.1033.


In tables 3.2-3.6 results from the BDS tests are given.
The BDS statistic strongly suggests that dependence is
apparent in every data set under investigation. The statistic
is significant at the 1% level for each contract, the critical
value being 2.576, under all embedding dimensions, and for all
sizes of $\delta$ chosen, except for d=2 and $\delta$=1x(standard deviation)
in the fifth contract. Though these results may seem quite
strong, they are consistent with other studies of financial
data. Hsieh (1989), in his investigation of changes in
exchange rate data, also finds that the BDS statistics are
extremely significant for all exchange rates, under all
embedding dimensions, and for all sizes of $\delta$ chosen.
Scheinkman and LeBaron (1989), by means of the BDS statistic,
also report that weekly stock returns are not i.i.d.

Tests  for  Nonlinear  Dependence 
In this section three tests for nonlinear dependence are
conducted. The ARCH specification test is constructed in the
familiar way. Under the null hypothesis, the squared
residuals are assumed to be white noise. The test statistic
is formulated by regressing the squared daily price changes on
their lags and calculating the LM test statistic $N \cdot R^2$ where N
is the number of observations. The statistic is distributed
$\chi^2(p)$ where p is the number of lags in the regression.
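
The LM form of the ARCH test described above can be sketched as an auxiliary regression of squared changes on a constant and their own lags. In this minimal illustration N is taken to be the number of observations available to the auxiliary regression, which is an assumption; the function name `arch_lm` and the simulated input are hypothetical.

```python
import numpy as np


def arch_lm(changes, p):
    """Return (N * R^2, R^2) from regressing squared changes on p of their lags."""
    e2 = np.asarray(changes, dtype=float) ** 2
    y = e2[p:]                                            # squared change at time t
    lagged = [e2[p - i:e2.size - i] for i in range(1, p + 1)]   # lags 1..p
    X = np.column_stack([np.ones(y.size)] + lagged)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return y.size * r2, r2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(500)                      # simulated i.i.d. changes
    lm, r2 = arch_lm(noise, p=4)
    print(lm, r2)    # compare lm with the chi-square(4) critical value
```

The statistic is then compared with the chi-square critical value for p degrees of freedom, for example 13.28 at the 1% level when p = 4.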

Tsay's test, under the null hypothesis, assumes that
daily price changes are i.i.d. Simulations in Tsay (1986)
show that this test has good power against nonlinear moving
average and bilinear models. The test statistic is
constructed in the following way:

1) regress $e_t$ on a constant and lags $e_{t-1}, \ldots, e_{t-J}$ and save the
residuals $u_t$

2) regress the cross-products $e_{t-i}e_{t-k}$, $1 \le i \le k \le J$, on the
same lags as in 1), $e_{t-1}, \ldots, e_{t-J}$, and save the residual vector $X_t$

3) regress $u_t$ on $X_t$ and save the residual $v_t$

4) form the test statistic in the following way:

$F = \frac{\left(\sum_t X_t u_t\right)\left(\sum_t X_t' X_t\right)^{-1}\left(\sum_t X_t' u_t\right)/j}{\sum_t v_t^2/(N-J-j-1)} \sim F(j, N-J-j-1)$  (3.4)

where N is the number of observations, J is the dimension of the
lag in 1) and 2), and j = J(J+1)/2. The limiting distribution
of the test statistic is F(j, N-J-j-1).
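
The four steps listed above can be sketched as three least-squares regressions. This is an illustration under stated assumptions rather than the exact routine used here: a constant is included in the step 2 regressions alongside the lags, N is taken to be the full series length, and the function name `tsay_test` is hypothetical.

```python
import numpy as np


def tsay_test(e, J):
    """Return (F statistic, numerator dof j, denominator dof N - J - j - 1)."""
    e = np.asarray(e, dtype=float)
    n = e.size
    y = e[J:]
    lags = np.column_stack([e[J - i:n - i] for i in range(1, J + 1)])
    W = np.column_stack([np.ones(y.size), lags])          # constant plus J lags

    # Step 1: residuals of the linear autoregression
    u = y - W @ np.linalg.lstsq(W, y, rcond=None)[0]

    # Step 2: cross-products of lags, purged of the same regressors
    cross = np.column_stack([lags[:, i] * lags[:, k]
                             for i in range(J) for k in range(i, J)])
    X = cross - W @ np.linalg.lstsq(W, cross, rcond=None)[0]

    # Step 3: regress u on the purged cross-products
    v = u - X @ np.linalg.lstsq(X, u, rcond=None)[0]

    # Step 4: F ratio
    j = J * (J + 1) // 2
    dof = n - J - j - 1
    xu = X.T @ u
    numerator = xu @ np.linalg.solve(X.T @ X, xu) / j
    denominator = (v @ v) / dof
    return numerator / denominator, j, dof


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(500)                       # simulated i.i.d. changes
    print(tsay_test(noise, J=3))
```

For J = 3 there are six distinct cross-products, so the statistic is compared with an F distribution with 6 numerator degrees of freedom.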

The  results  from  applying  the  ARCH  specification  test  and 
Tsay's  test  to  daily  price  changes  are  given  in  tables  3.7  and 
3.8  respectively.  According  to  the  ARCH  specification  test, 
the  data  exhibits  extreme  multiplicative  dependence  for  lags 
3,  4,  and  5.  Nonlinear  dependence  also  appears  in  the  results 
of  the  Tsay  test. 

Table 3.7

ARCH Specification Tests

Contracts      88(3)      88(6)      88(9)      88(12)     89(3)

lag 1           1.79       1.28       1.71       3.45       2.78
lag 2           6.43       2.20       1.72       4.55       2.75
lag 3           9.08      12.59*     27.17*     38.03*     51.86*
lag 4          24.04*     30.09*     38.35*     53.62*     58.24*
lag 5          24.00*     33.47*     38.35*     55.51*     58.23*

The asterisk in tables 3.7 and 3.8 indicates rejection of the null at the 1%
level of significance.

Table 3.8

Tsay's Tests for Nonlinearity

                      J=2        J=3        J=4        J=5

Contract 88(3)       2.309      3.965*     5.927*     4.034*
Contract 88(6)       1.221      4.219*     5.751*     4.136*
Contract 88(9)        .868      -.361      4.733*     3.439*
Contract 88(12)      2.260      3.263*     5.519*     3.816*
Contract 89(3)       1.662      3.052*     5.244*     3.186*

The tests above, the ARCH specification test and Tsay's
test, have good power against deviations from their specific
nulls, but they, like any statistical test, are not
foolproof. To lend emphasis to these results the BDS test is
also applied. As mentioned above, the BDS test is a general
test of clustering. It distinguishes random from nonrandom
behavior by considering whether or not a time series clusters
in dimensions greater than one. However, because the BDS test
may capture linear as well as nonlinear dependence, before I
can apply this test, the data are first purged of any linear
dependence.¹ To do this $e_t$ is regressed on 10 of its lags and
the residuals are then kept for observation. The estimated
coefficients and their corresponding t-statistics from these
regressions are given in table 3.9.
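
The linear purge amounts to keeping the residuals of an autoregression. The sketch below is a minimal illustration with a constant and 10 lags, matching the description above; the function name `linear_purge` and the simulated input are hypothetical.

```python
import numpy as np


def linear_purge(e, lags=10):
    """Regress e_t on a constant and its first `lags` lags; return (residuals, coefficients)."""
    e = np.asarray(e, dtype=float)
    n = e.size
    y = e[lags:]
    lagged = [e[lags - i:n - i] for i in range(1, lags + 1)]
    X = np.column_stack([np.ones(y.size)] + lagged)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef, coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    changes = rng.standard_normal(500)        # simulated price changes, illustration only
    resid, coef = linear_purge(changes)
    print(resid.size, coef.round(3))
```

The residual series is shorter than the input by the number of lags, and by construction it is uncorrelated in sample with each of the lagged regressors, so any remaining dependence picked up by a later test is not linear in those lags.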

The  results,  on  the  whole,  are  not  surprising.  Besides 
the  fact  that  in  three  of  the  contracts  the  third  lag  has  a 
significant  t-statistic  and  in  one  contract  the  second  lag 
does  at  the  5%  level  of  significance,  the  data  cannot  be 
explained  by  its  lags. 


¹ Brock (1987) has shown that the asymptotic distribution
of the BDS test applies to residuals of linear regressions.

Table 3.9

Regression Results From a Linear Purge of the Data

Contract       88(3)      88(6)      88(9)      88(12)     89(3)

constant        .120       .011      -.358      -.312      -.242
               (.240)     (.021)    (-.663)    (-.645)    (-.507)

lag 1           .043       .013       .010      -.019       .002
               (.909)     (.266)     (.202)    (-.398)     (.032)

lag 2           .150*      .071       .028       .037      -.056
               (3.20)     (1.46)     (.559)     (.778)    (-1.18)

lag 3          -.015      -.065      -.128*     -.139*     -.147*
              (-.324)    (-1.33)    (-2.53)    (-2.97)    (-3.12)

lag 4          -.062       .0096      .027       .0063      .055
              (-1.31)     (.196)     (.525)     (.133)     (1.16)

lag 5          -.015      -.048      -.037       .001      -.032
              (-.327)    (-.977)    (-.723)     (.211)    (-.667)

lag 6           .066       .060       .073       .061       .066
               (1.40)     (1.23)     (1.44)     (1.29)     (1.39)

lag 7           .044       .080       .075       .043       .039
               (.923)     (1.64)     (1.48)     (.897)     (.828)

lag 8          -.069      -.082      -.069      -.047      -.094*
              (-1.47)    (-1.67)    (-1.37)    (-.995)    (-2.00)

lag 9           .121       .048       .060       .052       .023
               (.259)     (.983)     (1.18)     (1.10)     (.481)

lag 10          .370       .063       .043       .056       .015
               (.789)     (1.36)     (.851)     (1.24)     (.328)

$R^2$             .033       .031       .037       .034       .044

T-statistics are in parentheses. The asterisk indicates significance at the 5% level.


In tables 3.10-3.14, the results from applying the BDS test to these residuals, or linearly purged daily price changes, are given. Compared with tables 3.2-3.6, the BDS statistics are not as significant, and for several embedding dimensions and sizes of δ, especially in contracts 88(9) and 89(3), the test statistic, using the same critical value as above, is insignificant at the 1% level of significance. Nevertheless, nonlinear dependence is still apparent in all the contracts when the data are considered in the third, fourth, and fifth dimensions.
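
The clustering idea behind the BDS test can be illustrated with the correlation integral, the share of pairs of d-dimensional histories that lie within δ of each other. The sketch below is an assumption-laden illustration: the function names are hypothetical, the maximum norm is assumed for the distance, and only the numerator of the statistic is computed. The standard deviation needed to form the full BDS statistic reported in the tables is deliberately left out.

```python
import math


def correlation_integral(x, d, delta):
    """Fraction of pairs of d-histories of x whose elements all differ by less than delta."""
    n = len(x) - d + 1
    close = 0
    for s in range(n):
        for t in range(s + 1, n):
            if all(abs(x[s + k] - x[t + k]) < delta for k in range(d)):
                close += 1
    return 2.0 * close / (n * (n - 1))


def bds_numerator(x, d, delta):
    """sqrt(T) * (C_d - C_1^d); close to zero (in distribution) for an i.i.d. series."""
    c_d = correlation_integral(x, d, delta)
    c_1 = correlation_integral(x, 1, delta)
    return math.sqrt(len(x)) * (c_d - c_1 ** d)
```

Under independence the probability that two d-histories are close is simply the d-th power of the probability that two single observations are close, which is why a large value of `bds_numerator` points to dependence of some kind.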


Table 3.10

The BDS Test Applied to the Linearly Purged Contract 88(3)

d    δ=1xSD    δ=1.25xSD   δ=1.5xSD   δ=1.75xSD   δ=2xSD
2    3.5623    3.7048      3.6264     3.5316      3.5157
3    4.1828    3.9711      3.4567     3.1463      3.0836
4    4.9650    4.6282      3.9915     3.6501      3.6135
5    5.7971    5.3844      4.6339     4.2265      4.3420

In tables 3.10-3.14 the BDS test is applied to the filtered data produced by removing any autocorrelative structure. SD refers to the standard deviation of the purged contract. SD for this series is 0.1068.

Table 3.11

The BDS Test Applied to the Linearly Purged Contract 88(6)

d    δ=1xSD    δ=1.25xSD   δ=1.5xSD   δ=1.75xSD   δ=2xSD
2    3.3957    3.6894      3.4385     3.2187      3.0684
3    3.9326    4.3487      3.9327     3.4878      3.1275
4    4.8076    5.3116      4.8022     4.2651      3.9466
5    5.4210    6.0502      5.5844     5.0139      4.7938

See note above. The standard deviation for this series is 0.10[illegible]8.

Table 3.12

The BDS Test Applied to the Linearly Purged Contract 88(9)

d    δ=1xSD    δ=1.25xSD   δ=1.5xSD   δ=1.75xSD   δ=2xSD
2    2.4435    2.6273      2.2894     2.2408      2.6197
3    2.8618    3.1556      3.0697     2.8215      2.9288
4    3.6944    4.1983      4.2096     3.8941      3.9861
5    4.2409    4.9140      5.0146     4.6610      4.7389

See note above. The standard deviation for this series is 0.1059.

Table 3.13

The BDS Test Applied to the Linearly Purged Contract 88(12)

d    δ=1xSD    δ=1.25xSD   δ=1.5xSD   δ=1.75xSD   δ=2xSD
2    2.5457    2.8968      2.9404     2.8035      2.2895
3    2.7290    3.1339      3.5171     3.4107      3.2920
4    3.4815    4.1733      4.4914     4.4012      4.3097
5    4.1724    5.0279      5.4630     5.3847      5.2115

See note above. The standard deviation for this series is 0.1025.

Table 3.14

The BDS Test Applied to the Linearly Purged Contract 89(3)

d    δ=1xSD    δ=1.25xSD   δ=1.5xSD   δ=1.75xSD   δ=2xSD
2     .2148     .9757      1.8349     2.4199      2.4109
3     .7160    1.5176      2.3759     2.9271      2.8889
4    1.8713    2.6158      3.4325     3.9639      4.2048
5    2.6073    3.3522      4.1733     4.7390      4.9973

See note above. The standard deviation for this series is 0.1.


Differentiating Between Additive and Multiplicative Nonlinear Dependence


In this section I examine this nonlinear dependence a little more closely. Several theoretical models have recently been proposed to handle nonlinear time series. Some of the more popular models are the following:

Robinson (1979) suggested the Nonlinear Moving Average (MA) model. A simple example is

$e_t = u_t + \alpha u_{t-1} u_{t-2}$.  (3.5)

Tong and Lim (1980) introduced the threshold autoregressive model. An example is

$e_t = \alpha e_{t-1} + u_t$, if $e_{t-1} < 1$,  (3.6)
$e_t = \beta e_{t-1} + u_t$, otherwise.

Granger and Andersen (1978) proposed the bilinear time-series model,

$e_t = u_t + \alpha e_{t-1} u_{t-1}$.  (3.7)

In all three models above, $e_t$ is the daily price change, $u_t$ is a normal, independently and identically distributed random variable with mean 0 and variance $\sigma^2$, $E(e_t \mid e_{t-1}, u_{t-1}) \neq 0$ and $\mathrm{Var}(e_t \mid e_{t-1}, u_{t-1}) = \sigma^2$.
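
To make the three examples concrete, the sketch below simulates each of them. It is an illustration only: the function names, the parameter values (0.5 and -0.5), the zero starting values, and the seed are assumptions chosen for convenience and are not taken from the data.

```python
import random


def simulate(model, n, a=0.5, b=-0.5, sigma=1.0, seed=0):
    """Simulate n observations from the nonlinear MA ("nma"), threshold AR ("tar")
    or bilinear ("bilinear") example, driven by i.i.d. normal shocks u_t."""
    if model not in ("nma", "tar", "bilinear"):
        raise ValueError("unknown model: %s" % model)
    rng = random.Random(seed)
    u = [rng.gauss(0.0, sigma) for _ in range(n + 2)]
    e = [0.0, 0.0]  # two start-up values, dropped before returning
    for t in range(2, n + 2):
        if model == "nma":
            e.append(u[t] + a * u[t - 1] * u[t - 2])
        elif model == "tar":
            coef = a if e[t - 1] < 1.0 else b
            e.append(coef * e[t - 1] + u[t])
        else:
            e.append(u[t] + a * e[t - 1] * u[t - 1])
    return e[2:]


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    num = sum(dev[t] * dev[t - lag] for t in range(lag, len(x)))
    return num / sum(d * d for d in dev)
```

A series drawn from the nonlinear moving average has essentially no linear autocorrelation, which illustrates why tests of linear dependence alone can miss this kind of structure.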

A modest example of Engle's (1982) Autoregressive Conditional Heteroskedastic (ARCH) model is

$e_t = u_t$  (3.8)

where $u_t$ is conditionally normally distributed with mean 0 and variance

$h_t = \alpha_0 + \alpha_1 e_{t-1}^2$.  (3.9)

A simple example of the time-varying parameter (TVP) model is

$e_t = \beta_t\,(\ldots) + u_t$  (3.10)
$\beta_t = \alpha + \delta z_t + v_t$

where $z_t$ is some variable that explains movements in $\beta_t$, $\mathrm{Var}(u_t) = \sigma_u^2$, $\mathrm{Var}(v_t) = \sigma_v^2$, $\mathrm{Cov}(u_t, v_s) = 0$ for all $t$ and $s$, and $\mathrm{Cov}(u_t, u_s) = \mathrm{Cov}(v_t, v_s) = 0$ for all $t \neq s$.

In both of these models $E(e_t \mid e_{t-1}, u_{t-1}) = 0$ and $\mathrm{Var}(e_t \mid e_{t-1}, u_{t-1})$ is not constant over time.

The data contain at least one of two types of nonlinear dependence, additive and/or multiplicative.

Additive dependence:

$u_t = v_t + f(e_{t-1}, \ldots, e_{t-k}, u_{t-1}, \ldots, u_{t-k})$

Multiplicative dependence:

$u_t = v_t \, f(e_{t-1}, \ldots, e_{t-k}, u_{t-1}, \ldots, u_{t-k})$

where $v_t$ is an i.i.d. random variable with zero mean and independent of past $e_t$'s and $u_t$'s, the $e_t$'s are the price differences, the $u_t$'s are the residuals from the linear regression results given in table 3.9, and $f(\cdot)$ is an arbitrary nonlinear function of $e_{t-1}, \ldots, e_{t-k}, u_{t-1}, \ldots, u_{t-k}$, for some finite $k$. We differentiate between these types by looking at the data's conditional means and variances. If the data solely exhibit additive nonlinear dependence, then the dependence enters only through the mean of the process and the conditional mean and variance will be similar to those expressed by the first three models above. If the data solely display multiplicative nonlinear dependence, then the dependence enters only through the variance of the process and the last two models would be candidates for modelling their evolution.

Hsieh (1989) developed a test to distinguish between these two types of nonlinear dependence that is based on examining the conditional mean and variance of a time series. The time series, the first differences of daily prices, is first purged of linear dependence by using the residuals from the regression results given in table 3.9. The test is defined in the following way.

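$\rho_{vvv}(i,j)$ is defined as $E(v_t v_{t-i} v_{t-j}) / \sigma_v^3$, where the $v_t$ are the residuals from the regression equations in table 3.9. The null hypothesis is that the process contains multiplicative nonlinearity. Note that this implies that $E(v_t v_{t-i} v_{t-j}) / \sigma_v^3 = 0$ for all $i, j > 0$. $\rho_{vvv}(i,j)$ is estimated by

$r_{vvv}(i,j) = \dfrac{(1/T) \sum_t v_t v_{t-i} v_{t-j}}{\left[(1/T) \sum_t v_t^2\right]^{3/2}}$.  (3.11)

Under the null hypothesis, $\rho_{vvv}(i,j) = 0$ and $\sqrt{T}\left[(1/T) \sum_t v_t v_{t-i} v_{t-j}\right]$ is asymptotically normally distributed with mean 0 and variance

$\omega_{i,j} = \operatorname{plim}_{T \to \infty} (1/T) \sum_t v_t^2 v_{t-i}^2 v_{t-j}^2$,

provided that the probability limit exists. Given this, $\sqrt{T}\, r_{vvv}(i,j)$ is asymptotically distributed $N(0, \omega_{i,j}/\sigma_v^6)$. $\omega_{i,j}/\sigma_v^6$ can be consistently estimated by

$\dfrac{(1/T) \sum_t v_t^2 v_{t-i}^2 v_{t-j}^2}{\left[(1/T) \sum_t v_t^2\right]^{3}}$.  (3.12)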

The third-order moment test, as Hsieh (1989) calls it, is designed to reject the null hypothesis only in the presence of additive nonlinear dependence. The test statistic is

$H_{i,j} = \sqrt{T}\, r_{vvv}(i,j) \Big/ \sqrt{\hat{\omega}_{i,j} / \hat{\sigma}_v^6}$,  (3.13)

where $\hat{\omega}_{i,j} / \hat{\sigma}_v^6$ is the estimate given in (3.12). A rejection of the null for a two-tailed test at the 1% level of significance is found if the absolute value of $H_{i,j}$ is larger than 2.576. The test is applied to the futures data for up to 5 lags for both $i$ and $j$. The results are presented in table 3.15 below.
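
As a rough illustration of how the third-order moment statistic can be computed from a residual series, consider the sketch below. It assumes the standardized form of the statistic (the sample third moment divided by its estimated standard deviation and scaled by the square root of the sample size); the function name and the simulated series in the usage note are hypothetical.

```python
import math
import random


def hsieh_statistic(v, i, j):
    """Standardized third-order moment statistic for lags (i, j).

    Equals sqrt(T) * r(i, j) / sqrt(omega_hat / sigma_hat^6), where r(i, j) is the
    sample third moment scaled by the 3/2 power of the sample second moment.
    """
    lag = max(i, j)
    T = len(v) - lag
    triple = [v[t] * v[t - i] * v[t - j] for t in range(lag, len(v))]
    m2 = sum(x * x for x in v[lag:]) / T
    r = (sum(triple) / T) / m2 ** 1.5
    omega_ratio = (sum(x * x for x in triple) / T) / m2 ** 3
    return math.sqrt(T) * r / math.sqrt(omega_ratio)


def additive_example(n, a=0.8, seed=11):
    """Hypothetical additive series v_t = u_t + a * u_{t-1} * u_{t-2}."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n + 2)]
    return [u[t] + a * u[t - 1] * u[t - 2] for t in range(2, n + 2)]
```

For instance, `abs(hsieh_statistic(additive_example(5000), 1, 2))` should exceed 2.576, since the additive example has a nonzero third moment at lags (1, 2), whereas an i.i.d. series should typically give values inside the critical bounds.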

These results indicate that multiplicative dependence is the type found in the data for all contracts except 88(9). Since the null is rejected for every lag pair except (1,1) when testing contract 88(9), I conclude that there exists additive nonlinearity in this data set. However, the results from the ARCH specification test above suggest that multiplicative dependence is also a part of this data set. Therefore, I conclude that all data sets contain at least multiplicative dependence and some may contain additive dependence.

Table 3.15

Hsieh's Test to Distinguish Between Additive and Multiplicative Nonlinear Dependence

Lags          Contracts:
i,j       88(3)      88(6)      88(9)      88(12)     89(3)
1,1        .395       .308      2.572       .280       .979
2,1        .521       .764     10.720      1.601       .926
2,2        .958       .832      5.864       .552       .550
3,1       1.680      1.563     14.638      1.235       .804
3,2        .696       .839      9.740      1.328       .726
3,3        .627       .776      7.353       .764       .810
4,1       1.585      1.554     14.914      1.475      1.542
4,2       1.039      1.326     14.967      1.600       .759
4,3       1.064       .740      8.732      1.183      1.144
4,4       1.195      1.195     11.750      1.155      1.207
5,1        .813       .459      4.007       .699       .272
5,2       1.817       .960      7.785       .764       .598
5,3       1.234       .996      8.495       .790       .304
5,4        .655       .493      6.693       .754       .945
5,5        .876      1.016     10.856       .895      1.132

Note: The results presented are the absolute values of the test statistics.


Summary 

There is mounting evidence (Hsieh, 1989; Scheinkman and LeBaron, 1989; Papell and Sayers, 1989) that asset prices contain nonlinear dependencies that are neither modelled by linear asset-pricing functions nor considered by asset traders.

This chapter shows that dependencies are found in the changes of futures prices. Evidence from the BDS test, Tsay's test for nonlinearity, and the ARCH specification test indicates that this dependence is nonlinear. Hsieh's third-order moment test suggests that the nonlinear dependencies are primarily found in the variances of the data.


CHAPTER  4 
PREDICTION 

Introduction 

In chapter 2, using several tests for nonstationarity, I concluded that Treasury Bill futures prices contain a unit root. In chapter 3, first differences of the data were tested for nonlinear dependence. Relying on the results from several tests, a part of the behavior of Treasury Bill futures prices appears to be explained by nonlinear dependence. Hsieh's (1989) test indicated that the type of nonlinear dependence is multiplicative, i.e., the nonlinearity enters through the variance of the process. However, when using Tsay's (1986) test, which has good power against processes that contain additive nonlinear dependence (nonlinear dependence that enters through the mean of the process), nonlinear dependence was found. Hence, it is very possible that the data contain both additive and multiplicative nonlinear dependence.

In this chapter the issues of modelling nonlinear time series and nonlinear prediction are addressed. Because of the results in the previous chapter, the data are presumed to contain nonlinear dependencies. Hence, I first model the data using several nonlinear models and then compare them against a simple random walk process and each other by considering their predictive power.

So far, this concern has not been addressed in the futures markets literature, but recently, in the exchange rate literature, this issue has emerged. The conclusions are mixed and depend upon the particular models used to capture the nonlinearities. Meese and Rose (1989) find that the poor explanatory power that several popular exchange rate models exhibit cannot be attributed to nonlinearities arising from time deformation or improper functional form. Diebold and Nason (1990) nonparametrically estimate the conditional mean functions of ten major exchange rates using a technique known as "locally weighted regression." They conclude that considering nonlinearities does not help point prediction. On the other hand, by modelling exchange rate dynamics as a sequence of stochastic, segmented time trends, Engel and Hamilton (1990) find that nonlinear dependence may be exploitable for predictive purposes. They show that a stochastic, segmented trends model predicts better than a random walk.

In this chapter, I compare the predictions of a simple random walk process, i.e.,

$P_t = P_{t-1} + e_t$, where $e_t \sim N(0, \sigma^2)$,  (4.1)

with those of an ARCH-in-Mean (ARCH-M) model (Engle, Lilien, and Robins, 1987), a Generalized Autoregressive Conditional Heteroskedastic-in-Mean (GARCH-M) model (Bollerslev, 1986), the bilinear model (Granger and Andersen, 1978a), a Time Varying Parameter (TVP) model, a Time Series Segmentation Model (Sclove, 1983), and a Stochastic, Segmented Trends model (Hamilton, 1989), all described below.

The chapter is organized as follows. The next section describes in detail and justifies each of the nonlinear models considered. After this discussion, a section is devoted to fitting the models to the data. Some diagnostic tests on the residuals of the estimated models are also conducted in this section. Prediction and comparison of the models are taken up after this, and the last section summarizes the chapter.

A Look  at  Some  Nonlinear  Time  Series  Models 

Six  nonlinear  models  are  considered  in  this  section. 
These  models  are  the  ARCH-in-Mean  (ARCH-M)  model  (Engle, 
Lilien,  and  Robins,  1987) , Generalized  Autoregressive 
Conditional  Heteroskedastic-in-Mean  (GARCH-M)  model 
(Bollerslev,  1986),  the  bilinear  model  (Granger  and  Andersen, 
1978a) , a Time  Varying  Parameter  model  (TVP) , a Time  Series 
Segmentation  Model  (Sclove,  1983) , and  a Stochastic,  Segmented 
Trends  model  (Hamilton,  1989) . 

The  ARCH-M  and  GARCH-M  Models 

The ARCH-M and GARCH-M models both use a function of the conditional variance of a time series to explain the mean of its process. Consider the series $P_t$ to be modeled as

$\Delta P_t = \delta + \gamma f(h_t) + e_t$,

where $e_t$ is an error term with zero mean and conditional variance $h_t = E(e_t^2 \mid I_{t-1})$, $f(h_t)$ is a function of the conditional variance, and $I_{t-1}$ is the set of all information available at time $t-1$. A specific form of the conditional variance,

$h_t = \alpha_0 + \sum_{k=1}^{q} \alpha_k e_{t-k}^2$,  (4.2)

proposed by Engle, Lilien, and Robins (1987), is known as the ARCH-M(q) model. Bollerslev (1986) generalized this form by allowing lagged values of the conditional variance, in addition to lagged squared residuals, to explain its contemporaneous value, i.e.,

$h_t = \alpha_0 + \sum_{j=1}^{p} \beta_j h_{t-j} + \sum_{k=1}^{q} \alpha_k e_{t-k}^2$.  (4.3)

When this form is used the model is known as the generalized ARCH-M model or GARCH-M(p,q) model. The parameters of both models satisfy the following conditions when appropriate: $\alpha_0 > 0$, $\alpha_k, \beta_j \geq 0$, $k = 1, \ldots, q$, $j = 1, \ldots, p$.
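
The recursion implied by the GARCH-M(p,q) variance equation can be sketched as follows. The sketch makes two assumptions that are not in the text: presample squared residuals and presample variances are both set to the unconditional variance, and the parameters are such that the sum of all alpha and beta coefficients is below one. The function name is hypothetical.

```python
def garch_variance(e, alpha0, alphas, betas):
    """Conditional variances h_t for residuals e under a GARCH(p, q) variance equation.

    alphas holds the q coefficients on lagged squared residuals, betas the p coefficients
    on lagged conditional variances (an empty list gives the ARCH(q) case).
    """
    persistence = sum(alphas) + sum(betas)
    if alpha0 <= 0 or persistence >= 1:
        raise ValueError("need alpha0 > 0 and sum of alphas and betas below 1")
    h0 = alpha0 / (1.0 - persistence)  # unconditional variance, used as presample value
    h = []
    for t in range(len(e)):
        ht = alpha0
        for k, a in enumerate(alphas, start=1):
            ht += a * (e[t - k] ** 2 if t - k >= 0 else h0)
        for j, b in enumerate(betas, start=1):
            ht += b * (h[t - j] if t - j >= 0 else h0)
        h.append(ht)
    return h
```

With residuals of zero the variance decays geometrically toward alpha0 / (1 - sum of betas), while a large residual raises the next period's variance. This is the volatility clustering that the in-mean term then passes on to the conditional mean.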

The empirical distribution of the variables generated by these processes is heavy tailed compared to the normal distribution. The unconditional mean and variance of an ARCH-M and a GARCH-M process are constant, with the unconditional variances of $e_t$ equal to

$\alpha_0 \Big/ \left(1 - \sum_{k=1}^{q} \alpha_k\right)$  (4.4)

and

$\alpha_0 \Big/ \left(1 - \sum_{j=1}^{p} \beta_j - \sum_{k=1}^{q} \alpha_k\right)$  (4.5)

respectively, but the conditional mean and variance are time dependent as shown above. The fact that conditional variances are allowed to depend on past realized variances is consistent with the actual volatility pattern observed in most financial markets during both stable and unstable periods.

ARCH and GARCH models have been successfully applied to foreign exchange rate data by Domowitz and Hakkio (1985), Diebold and Pauly (1988), and Hsieh (1989), and to stock market data by Akgiray (1989). Engle, Lilien, and Robins (1987) fruitfully applied the ARCH-M model to expected bond returns, and Engle and Bollerslev (1986) used a GARCH-M model to model the risk premium in the foreign exchange market.

The  Bilinear  Model 

The bilinear model, proposed by Granger and Andersen (1978a), was introduced as a simple generalization of linear models. This class of nonlinear models may be regarded as the natural nonlinear extension of Autoregressive Moving Average (ARMA) processes. Just as the ARMA process is sufficiently general to approximate most linear series that arise in the real world, the introduction of the bilinear model marked the beginning of work in time series analysis concerned with finding a general nonlinear, univariate model. The bilinear model is not dramatically nonlinear; however, the bilinear class of models is non-explosive, invertible, and useful in forecasting. Granger and Andersen (1978a), applying simple bilinear models to IBM daily common stock closing prices,^ and Gabr and Rao (1981), applying a bilinear model to Canadian lynx data,^ both show the ability of the bilinear model to forecast. Maravall (1983) shows that bilinear models are able to improve upon the Bank of Spain's linear ARIMA forecasts of currency demand.

In this chapter I consider a specific first-order bilinear model motivated by the following. In the futures markets literature, it has long been accepted that futures prices follow a simple random walk process. That is,

$P_t = P_{t-1} + e_t$, where $e_t \sim N(0, \sigma^2)$.

Although I concluded that Treasury Bill futures prices contain a unit root, some doubt was cast on this specification in chapter 3. There, it was shown that $e_t$ exhibits nonlinear dependence if $P_t$ is modeled as a random walk. This leads me to believe that the specification may be more reasonable if the expectation of $P_t$ at time $t-1$ is permitted to be a nonlinear function of past information.

^ These are closing prices for 169 trading days beginning May 17, 1961.

^ These data give the number of lynx trapped annually rather than the actual population.

If a series $P_t$ is generated by

$P_t$ = (expectation of $P_t$ made at time $t-1$) + $e_t$

so that $e_t$ is essentially the expectation error, and if these expectations are a function of the most recent data available at time $t-1$, that is $P_{t-1}$ and $e_{t-1}$, of the form

$E(P_t \mid P_{t-1}, e_{t-1}) = g(P_{t-1}, e_{t-1})$

then there is no reason to believe that this function will be linear. One way of picking up at least part of the nonlinearity is to use the approximation

$g(P, e) = aP + bPe + de$

which gives a bilinear model for the series $P_t$. Note that the specified approximation allows for both "main effects," $aP$ and $de$, and an "interaction" or "cross-impact" effect $bPe$. The first-order bilinear model that results for futures prices is

$P_t = a P_{t-1} + b P_{t-1} e_{t-1} + d e_{t-1} + e_t$,  (4.6)

where $e_t$ is the usual white noise series.
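
One-step-ahead prediction with the first-order bilinear model can be sketched as below. Because the expectation error is not observed directly, it has to be built up recursively from past forecasts. Starting the recursion with a zero error is an assumption of this sketch, as are the function name and the illustrative parameter values.

```python
def bilinear_forecasts(prices, a, b, d):
    """One-step-ahead forecasts of P_t from g(P, e) = a*P + b*P*e + d*e.

    forecasts[k] is the forecast of prices[k + 1] made with information through prices[k].
    The expectation error e is updated after each forecast as actual minus forecast.
    """
    e_prev = 0.0  # assumed starting value for the unobserved error
    forecasts = []
    for t in range(1, len(prices)):
        p_prev = prices[t - 1]
        f = a * p_prev + b * p_prev * e_prev + d * e_prev
        forecasts.append(f)
        e_prev = prices[t] - f
    return forecasts
```

Setting `a = 1` and `b = d = 0` reproduces the random walk forecast, in which tomorrow's predicted price is simply today's price, so the random walk is nested within this specification.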

A Time-Varying Parameter Model

In another attempt to model nonlinearity, a simple time-varying parameter model is used. The intuition behind its use is that if a market is not fully efficient, as I concluded about the Treasury Bill futures market in chapter 3 due to the nonlinear dependence found in the data, then not all of a market's relevant information will be disclosed by its price. If the price of a security does not contain all of the market's relevant information, then, depending upon the importance of the information not contained in the price, it is possible that the reaction of today's price to yesterday's will be different across time. Hence, I propose the following model:

$P_t = \beta_t P_{t-1} + e_t$, where $e_t \sim$ i.i.d. $(0, \sigma_e^2)$,
$\beta_t = \alpha \beta_{t-1} + u_t$, where $u_t \sim$ i.i.d. $(0, \sigma_u^2)$.  (4.7)

The equation representing the evolution of $\beta_t$ reflects a learning process on the part of market participants.

A Time-Series  Segmentation  Model 

In addition to the models proposed above, there are other paradigms that are specifically used for nonlinearly dependent variables and may explain the nonlinear dependence found in futures prices. One model that can be easily imagined is a time-series segmentation model, which hypothesizes that the changes in prices conform to one of two processes, where the processes depend upon particular states of nature. The explanation for why this model may be appropriate for futures prices is the same as that given for the TVP model, though in the segmentation model the dependence of today's price on yesterday's is more systematic.

The model that I use is a specific form of the model that Sclove (1983) proposed. I assume that there are two states of the world, $\gamma = 1, 2$, and the changes in futures prices follow a second-order autoregressive process under each state, i.e.,

$\Delta P_t = \phi_\gamma \Delta P_{t-1} + \rho_\gamma \Delta P_{t-2} + e_t$,

where $\phi_\gamma$ and $\rho_\gamma$ are the first- and second-order autoregressive parameters under state $\gamma$. Changes in futures prices are modeled because an assumption of the model is that the variable under analysis is covariance-stationary. The first-order autoregressive parameter is assumed to be positive under state 1 and negative under state 2, and the second-order parameter is not constrained to be either positive or negative under either state. The model assumes that the residuals from the autoregressive processes under each state are normal processes with constant and equal variances between states.

The algorithm begins by setting initial values for the parameters of each of the autoregressive processes and setting the transition probabilities, $p_{cd}$, where $c$ and $d$ indicate the previous and current state respectively, equal to 1/2. Also, the probability that the initial state of the world is state 1, $f(\gamma_1)$, is set to 1/2. With these initial values, the first state of the world is estimated by maximizing $f(\gamma_1) f(P_1 \mid \gamma_1)$, where

$f(P_1 \mid \gamma_1 = c) = (2\pi\sigma^2)^{-1/2} \exp\left(-\dfrac{e_1^2}{2\sigma^2}\right)$  (4.8)

and $e_1$ denotes the residual under state $c$ [...] if estimating the first-order autoregressive case. From the second state onward the states of the world are estimated by maximizing

$p_{\gamma_{t-1}\gamma_t} f(P_t \mid \gamma_t)$.  (4.9)

Once the states are labeled, the parameters of the autoregressive processes can be estimated by separating the data according to the two states and maximizing the likelihood function

$L = p_{11}^{n_{11}} p_{12}^{n_{12}} p_{21}^{n_{21}} p_{22}^{n_{22}} (2\pi\sigma^2)^{-n/2} \exp\left(-\dfrac{g}{2\sigma^2}\right)$  (4.10)

where

$g = \sum_{\gamma_t = 1} e_t^2 + \sum_{\gamma_t = 2} e_t^2$.

The transition probabilities can be estimated by $n_{cd}/n_c$, where $n_{cd}$ indicates the number of times the state changes from state $c$ to state $d$ and $n_c$ indicates the number of times the process is labeled by state $c$. With new transition probabilities and new autoregressive parameter estimates, the states, $\gamma$, may be reestimated. Finally, when no observation changes labels from the previous iteration, the algorithm stops.
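
The re-estimation of the transition probabilities from a labeled sequence of states is simple enough to sketch directly. In this illustration the function name is hypothetical, the count for state c is taken over observations that have a successor (so that each row of estimated probabilities sums to one), and a state that never occurs keeps the starting value of 1/2. These are assumptions of the sketch rather than details given by Sclove (1983).

```python
def transition_probabilities(states):
    """Estimate p_cd = n_cd / n_c from a sequence of state labels taking values 1 or 2."""
    counts = {(c, d): 0 for c in (1, 2) for d in (1, 2)}
    for c, d in zip(states, states[1:]):
        counts[(c, d)] += 1
    probs = {}
    for c in (1, 2):
        n_c = counts[(c, 1)] + counts[(c, 2)]  # times state c is followed by any state
        for d in (1, 2):
            probs[(c, d)] = counts[(c, d)] / n_c if n_c else 0.5
    return probs
```

For the labels `[1, 1, 2, 2, 2, 1]`, for example, state 1 is followed once by state 1 and once by state 2, so both estimated probabilities out of state 1 equal 1/2.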

The estimation procedure is based on what the likelihood function would have been if states were observable. Implicitly then, it is assumed that the actual historical states of nature are those that maximize the joint likelihood of the changes in futures prices and the states which produce the prices.

A Stochastic, Segmented Trends Model

Hamilton (1989) introduced another approach to modeling changes in regime. The stochastic specification is similar to that explored by Sclove (1983) above, although the statistical approach is quite different. The general idea is to decompose a nonstationary time series into a sequence of stochastic, segmented trends. The model postulates the existence of an unobserved state or regime variable, $S_t$, that is presumed to depend on past realizations of $\Delta P_t$ and $S$ only through $S_{t-1}$. When $S_t = 0$, the observed change in the futures price is presumed to have been drawn from a $N(\mu_0, \sigma_0^2)$ distribution, whereas when $S_t = 1$, $\Delta P_t$ is distributed $N(\mu_1, \sigma_1^2)$; thus when $S_t = 0$, the trend in the futures price is $\mu_0$, whereas when $S_t = 1$, the trend is $\mu_1$. Discrete shifts in the parameters of the distribution of futures prices are viewed as the outcome of a first-order, discrete-state Markov process which governs the transition between states:

$P(S_t = 0 \mid S_{t-1} = 0) = p_{11}$
$P(S_t = 1 \mid S_{t-1} = 0) = 1 - p_{11}$  (4.11)
$P(S_t = 0 \mid S_{t-1} = 1) = 1 - p_{22}$
$P(S_t = 1 \mid S_{t-1} = 1) = p_{22}$.

Given the parameters of the distribution, where it is assumed that the first and second moments completely describe the distribution, and a Markov process describing the transition probabilities from one state of nature to another, the state to which the segment of the series belongs is determined.

Hamilton's statistical approach differs from Sclove's in that the actual marginal likelihood function of the variable is found and then maximized with respect to the population parameters. The algorithm used to optimize the likelihood function relies on the EM principle of Dempster, Laird, and Rubin (1977) and is known as the EM algorithm.^ The advantage of the EM algorithm over other algorithms developed for numerical optimization is that it is robust to initial values.^ Once the optimal values of the parameters are found, the parameters along with the data are used to draw the statistical inferences about the unobserved states. Recall that Sclove (1983) calculated what the likelihood function would have been if the regimes were observable, and then assumed that the actual historical states were those that would make the joint likelihood function of the changes in futures prices along with unobserved states as large as possible.

I use Hamilton's method to estimate two different model specifications. In the first specification, futures price changes are assumed to follow an autoregressive process as in Sclove (1983). In the second model, price changes are assumed to be independent as in Engel and Hamilton (1990).

^ The EM algorithm is an iterative procedure composed of two steps, the expectation step (E-step) and the maximization step (M-step). Hence, its name.

^ The EM algorithm also avoids problems that are associated with likelihood functions of switching regression models, which are characterized by having many local maxima, singularities, and boundary problems.
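
For the second specification, in which price changes are independent within a regime, the inference about the unobserved state given a set of parameter values can be sketched with a simple recursive filter. This covers only the inference step and the evaluation of the marginal likelihood; it is not the EM algorithm, which would also update the parameters. The function names are hypothetical, and starting the recursion from the stationary probabilities of the Markov chain is an assumption of the sketch.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a N(mu, sigma^2) random variable at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def regime_filter(dp, mu, sigma, p11, p22):
    """Filtered probabilities P(S_t = 1 | data through t) and the marginal log likelihood.

    mu and sigma are pairs (state 0, state 1); p11 is the probability of staying in
    state 0 and p22 the probability of staying in state 1.
    """
    trans = [[p11, 1.0 - p11], [1.0 - p22, p22]]
    pi0 = (1.0 - p22) / (2.0 - p11 - p22)  # stationary probability of state 0
    prob = [pi0, 1.0 - pi0]
    filtered, loglik = [], 0.0
    for x in dp:
        pred = [prob[0] * trans[0][s] + prob[1] * trans[1][s] for s in (0, 1)]
        joint = [pred[s] * normal_pdf(x, mu[s], sigma[s]) for s in (0, 1)]
        lik = joint[0] + joint[1]
        loglik += math.log(lik)
        prob = [j / lik for j in joint]
        filtered.append(prob[1])
    return filtered, loglik
```

The returned log likelihood is the marginal likelihood of the observed price changes, with the unobserved states summed out at every date. This is the feature that distinguishes Hamilton's approach from the joint likelihood of prices and labeled states maximized in the segmentation model.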

Estimation  of  the  Models 

In  this  section  I give  the  results  from  the  estimation  of 
the  models  hypothesized  in  the  section  above.  Because  I will 
consider  out-of-sample  prediction  in  the  next  section,  the 
last  50  observations  in  every  data  set  are  not  used  for 
estimation. 

The  ARCH-M  and  GARCH-M  Models 

I started by fitting the ARCH-M(p) model to the five contracts. To fit a model, the value of p, the lag length in the variance equation, is prespecified. The log likelihood function, given by

$L(\phi) = T^{-1} \sum_{t=1}^{T} L_t(\phi)$,  (4.12)

where $\phi' = (\gamma, \alpha_0, \alpha_1, \ldots, \alpha_p)$ and the constant term is omitted, is then maximized with respect to $\phi$. Maximization of the likelihood function is carried out using the Berndt, Hall, Hall, and Hausman (1974) numerical optimization technique. For large samples, such as the data representing a Treasury Bill futures contract, the choice of the initial values of the parameters is not crucial.

Several specifications were tried using one to five lags in the variance equation and using the standard deviation and the variance in the mean equation. To choose the appropriate specification a standard likelihood ratio statistic was used. If $L(\phi_n)$ and $L(\phi_a)$ are the maximum likelihood function values under the null and alternative hypotheses respectively, then the statistic

$-2[L(\phi_n) - L(\phi_a)] \sim \chi^2(k)$  (4.13)

where $k$, the degrees of freedom, is the difference in the number of parameters under the null and the alternative.
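
The two calculations just described, evaluating the average log likelihood of an ARCH-M(p) model and comparing two specifications with a likelihood ratio statistic, can be sketched as follows. Several details are assumptions of the sketch: the per-observation contribution is the usual normal one with the constant omitted, the first p observations are used only as starting values with zero residuals, and the statistic is rescaled by the number of observations because the log likelihood is an average. The function names are hypothetical, and the table holds the standard 5% chi-square critical values.

```python
import math

# Standard 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def arch_m_loglik(dp, delta, gamma, alpha0, alphas, use_sd=True):
    """Average log likelihood of an ARCH-M(p) model for the price changes dp.

    Mean equation: dp_t = delta + gamma * f(h_t) + e_t, with f the standard deviation
    (use_sd=True) or the variance (use_sd=False) of the conditional distribution.
    """
    p = len(alphas)
    e = [0.0] * p  # assumed presample residuals
    total, n = 0.0, 0
    for t in range(p, len(dp)):
        h = alpha0 + sum(alphas[k] * e[t - 1 - k] ** 2 for k in range(p))
        risk = math.sqrt(h) if use_sd else h
        et = dp[t] - delta - gamma * risk
        e.append(et)
        total += -0.5 * math.log(h) - 0.5 * et * et / h
        n += 1
    return total / n


def lr_statistic(avg_loglik_null, avg_loglik_alt, n_obs):
    """Likelihood ratio statistic from two average log likelihoods over n_obs observations."""
    return -2.0 * n_obs * (avg_loglik_null - avg_loglik_alt)


def rejects_at_5pct(stat, k):
    """True if the statistic exceeds the 5% chi-square critical value with k degrees of freedom."""
    return stat > CHI2_5PCT[k]
```

In practice `arch_m_loglik` would be handed to a numerical optimizer. The sketch only evaluates the function at given parameter values and does not implement the Berndt, Hall, Hall, and Hausman technique.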

Table 4.1 gives the results of fitting the ARCH-M model to the data. The t-statistics are in parentheses. In addition to estimating the models, several diagnostic tests were conducted. First, a K-2 degree of freedom likelihood ratio test, where K is the number of parameters estimated in the model, for the null hypothesis that the endogenous variable follows a normal model with constant mean and variance is conducted. This test is applied as a necessary condition for applying the ARCH-M model to the data. Second, both the coefficients of skewness and kurtosis of the standardized residuals, i.e., $e_t/\sqrt{h_t}$, are given as an informal check of goodness of fit. If the model fits well, then the standardized residuals should satisfy the assumptions made before estimation. With respect to their third and fourth central moments, this means that the coefficients of skewness

Table 4.1

ARCH-M(p) Model Estimates

Parameter     88(3)      88(6)      88(9)      88(12)     89(3)

δ            -0.048       —         0.055       —          —
             (-5.23)                (4.44)
γ             0.641      0.037     -0.637     -0.015     -0.011
              (6.50)     (.822)    (-4.65)    (-.288)    (-.199)
α0            0.004      0.004      0.006      0.005      0.005
              (7.93)     (8.76)    (10.90)     (9.24)    (10.81)
α1            0.008      0.060      0.483      0.117      0.129
             (0.320)     (1.89)    (11.90)     (2.32)     (2.52)
α2            0.057      0.049      0.048      0.051      0.033
              (1.86)     (2.03)     (1.11)     (2.10)     (1.95)
α3            0.187      0.117       —         0.092      0.073
              (3.43)     (2.14)                (1.65)     (1.73)
α4            0.444      0.314       —         0.366      0.336
              (7.12)     (8.89)                (6.74)     (6.09)

LR(K-2)     170.57     124.45     78.922     121.29     117.16
Skewness     -0.115     -0.486      0.233      0.185      0.341
Kurtosis      3.867      8.179      7.544      5.989      6.307
L-B1         26.29       8.630      9.731      8.665      9.085
L-B2         18.29       4.881     16.71       8.453      5.726

and kurtosis of the standardized residuals should be
approximately equal to 0 and 3, respectively. Lastly, two
Ljung-Box statistics for twelfth-order serial correlation,
both distributed $\chi^2(12)$, are computed. The first tests the
normalized residuals and the second tests the squared normalized
residuals. They are denoted by L-B_1 and L-B_2, respectively.

The purpose of these last two tests is to see whether,
after modeling, the residuals contain any remaining linear
dependence or ARCH effects. If these tests fail to reject the
null, they indicate that the ARCH-M models fit the data
well.
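
The residual diagnostics described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the moment coefficients use the population (divide-by-n) form, and the simulated series merely stands in for the standardized residuals $\varepsilon_t/\sqrt{h_t}$ of a fitted model.

```python
import random


def skewness_kurtosis(z):
    """Coefficients of skewness and kurtosis (0 and 3 for a normal variable)."""
    n = len(z)
    mean = sum(z) / n
    m2 = sum((v - mean) ** 2 for v in z) / n
    m3 = sum((v - mean) ** 3 for v in z) / n
    m4 = sum((v - mean) ** 4 for v in z) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def ljung_box(z, lags=12):
    """Ljung-Box Q statistic; compare with a chi-square(lags) critical value."""
    n = len(z)
    mean = sum(z) / n
    d = [v - mean for v in z]
    denom = sum(v * v for v in d)
    q = 0.0
    for k in range(1, lags + 1):
        r_k = sum(d[t] * d[t - k] for t in range(k, n)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q


# Stand-in for standardized residuals (simulated, not the futures data).
random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(500)]
skew, kurt = skewness_kurtosis(z)
lb_levels = ljung_box(z, lags=12)                       # linear dependence
lb_squares = ljung_box([v * v for v in z], lags=12)     # remaining ARCH effects
print(skew, kurt, lb_levels, lb_squares)
```

Applying the same Q statistic to the squared series is what turns a test for serial correlation into a check for leftover conditional heteroskedasticity.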

In every data set, the likelihood ratio tests demonstrate
that $\Delta P_t$ does not have a constant mean and variance; that
is, $\Delta P_t$ is better described by a model that allows for
variation in the mean and variance of the process. Again,
for every data set, the skewness coefficient is not very
different from that found under a normal distribution, but the
coefficient of kurtosis is somewhat high, indicating that the
standardized residuals have heavy-tailed distributions.
Although it is assumed that the standardized residuals
should resemble standard normal random variables, the
ARCH-M model is not designed to model dependence found in
moments of higher than second order. Lastly, in every data
set, the Ljung-Box statistics are insignificant at the 5%
level of significance except for the test of linear dependence
in contract 88(3).

A natural generalization of the ARCH-M model is the
GARCH-M model. As with the ARCH-M model, several lag structures
were tried for the variance equation, and for every lag
structure tried, the mean equation was estimated with and
without a constant term. The GARCH-M model is estimated the same
way as the ARCH-M model except that
$\phi' = (\gamma, \alpha_0, \alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q)$,
where the $\beta$'s are the coefficients on the lagged
$h_t$'s in the variance equation. Once again, to find the
correct model specification for the data, the likelihood ratio
test described above was used. For every data set, the
GARCH-M(1,1) specification appeared to fit the data best. In
addition, for every data set, a mean equation without a
constant term seemed to be more appropriate than a mean
equation with one. When compared to the ARCH-M model, a mean
equation without a constant term may appear inconsistent.
However, because the GARCH-M model contains a lagged variance term
in the variance equation, the conditional variance at time t
may contain the information that the constant term proxied for
in the ARCH-M specification. Hence, the constant term may not
be necessary in the GARCH-M specification.
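
To make the role of the lagged variance term concrete, the following Python sketch evaluates a Gaussian log-likelihood for a GARCH-M(1,1) model without a constant in the mean equation. Several choices here are assumptions made for illustration and are not taken from the text: the conditional standard deviation (rather than the variance) enters the mean equation, the recursion is started at the sample variance, the constant of the log-likelihood is kept, and all names are hypothetical.

```python
import math


def garch_m_loglik(dp, gamma, alpha0, alpha1, beta1):
    """Log-likelihood and conditional variances for an assumed GARCH-M(1,1).

    Assumed mean equation:      dp_t = gamma * sqrt(h_t) + eps_t
    Assumed variance equation:  h_t  = alpha0 + alpha1 * eps_{t-1}^2 + beta1 * h_{t-1}
    """
    n = len(dp)
    mean = sum(dp) / n
    h = sum((x - mean) ** 2 for x in dp) / n  # start-up value for h_1 (assumption)
    eps_prev = None
    loglik = 0.0
    variances = []
    for x in dp:
        if eps_prev is not None:
            h = alpha0 + alpha1 * eps_prev ** 2 + beta1 * h
        eps = x - gamma * math.sqrt(h)
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + eps * eps / h)
        variances.append(h)
        eps_prev = eps
    return loglik, variances


# Illustrative price changes and parameter values (not estimates from the data).
changes = [0.1, -0.1, 0.1, -0.1]
ll, h_path = garch_m_loglik(changes, gamma=0.0, alpha0=0.001, alpha1=0.2, beta1=0.7)
print(ll, h_path)
```

In practice this function would be handed to a numerical optimizer; the sketch only shows how each $h_t$ carries forward information from $h_{t-1}$.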

Table 4.2 gives the results of fitting the GARCH-M model
to the data. The t-statistics are in parentheses. In each
data set the likelihood ratio tests again indicate that
$\Delta P_t$ is described better by a model that allows for variation
in the conditional mean and variance of the process over time.
The coefficients of skewness are slightly higher than those
reported under the ARCH-M models, but they still are not too
far from what is acceptable for normal processes. Again, the
coefficients of kurtosis are somewhat high. This may be for
the same reason mentioned above. For every data set, the
Ljung-Box test statistics are insignificant at the 5% level of
significance.

Table 4.2

GARCH-M(p,q) Model Estimates

Parameter     88(3)            88(6)            88(9)            88(12)           89(3)

γ             0.017 (.343)     0.019 (.378)     -0.031 (-.610)   -0.043 (-.906)   -0.050 (-1.06)
α_0           0.0003 (2.40)    0.0004 (2.87)    0.0009 (2.83)    0.0013 (3.89)    0.0014 (3.67)
[illegible]   0.779 (26.1)     0.836 (40.5)     0.731 (14.1)     0.696 (12.3)     0.6903 (11.2)
[illegible]   0.221 (8.95)     0.128 (11.0)     0.195 (7.45)     0.167 (7.63)     0.139 (8.67)
LR(K-2)       148.97           106.44           94.973           102.37           95.66
Skewness      -0.172           -0.126           0.465            0.487            0.706
Kurtosis      6.129            10.86            8.510            8.411            9.590
L-B_1         9.15             6.86             7.980            7.955            8.438
L-B_2         21.43            6.27             8.765            6.914            6.174


The  Bilinear  Model 

Next, the data were modeled by a bilinear process. As
Subba Rao (1977) pointed out, the problem of estimating the
parameters of a bilinear model does not differ, in principle,
from that of estimating the parameters of a linear model.
Thus, if one assumes that $\varepsilon_t$ is a Gaussian process, then given
observations on the series for $t = 1, \ldots, T$, the likelihood
function may, for large $T$, be written approximately as

$$L(\theta) = \exp\left[-\frac{1}{2\sigma_\varepsilon^2}\sum_{t=1}^{T}\varepsilon_t^2\right] \qquad (4.14)$$

where $\theta$ denotes the set of parameters $(a, b, d)$ and where, for
each value of $\theta$, $\varepsilon_t$ may be computed recursively from

$$\varepsilon_t = P_t - aP_{t-1} - bP_{t-1}\varepsilon_{t-1} - d\varepsilon_{t-1}. \qquad (4.15)$$

The maximum likelihood estimate, $\hat{\theta}$, is therefore obtained by
minimizing

$$V(\theta) = \sum_{t=1}^{T}\varepsilon_t^2$$

with respect to each element of $\theta$. Given a set of initial
estimates, $\theta_0$, a standard Newton-Raphson iterative technique
may be used to find the value, $\hat{\theta}$, which minimizes $V(\theta)$.
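
The recursive computation of the residuals and of the criterion $V(\theta)$ can be sketched in Python. The bilinear form used below, $P_t = aP_{t-1} + bP_{t-1}\varepsilon_{t-1} + d\varepsilon_{t-1} + \varepsilon_t$, is an assumption adopted for illustration, as are the start-up value $\varepsilon_0 = 0$, the function names, and the parameter values. The sketch only evaluates the criterion; it does not carry out the Newton-Raphson iterations.

```python
def bilinear_residuals(prices, a, b, d, eps0=0.0):
    """Recursively recover eps_t from the assumed bilinear form."""
    eps_prev = eps0
    resid = []
    for t in range(1, len(prices)):
        p_prev = prices[t - 1]
        eps = prices[t] - a * p_prev - b * p_prev * eps_prev - d * eps_prev
        resid.append(eps)
        eps_prev = eps
    return resid


def sum_of_squares(prices, a, b, d):
    """Criterion V(theta): the sum of squared recursive residuals."""
    return sum(e * e for e in bilinear_residuals(prices, a, b, d))


def simulate(p0, shocks, a, b, d):
    """Generate prices from known shocks, used only to check the recursion."""
    prices = [p0]
    eps_prev = 0.0
    for e in shocks:
        p_prev = prices[-1]
        prices.append(a * p_prev + b * p_prev * eps_prev + d * eps_prev + e)
        eps_prev = e
    return prices


# Illustrative values only; they are not the estimates reported in this chapter.
shocks = [0.1, -0.2, 0.05, 0.0, 0.15]
prices = simulate(93.0, shocks, a=1.0, b=-0.08, d=7.5)
recovered = bilinear_residuals(prices, a=1.0, b=-0.08, d=7.5)
print(recovered, sum_of_squares(prices, 1.0, -0.08, 7.5))
```

At the true parameter values the recursion returns the original shocks, which is why minimizing $V(\theta)$ is a sensible estimation criterion.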

Several initial values were chosen for the parameters;
however, for the data sets considered in this paper, they did
not significantly change the final estimates of a, b, and d.
In addition, the model gave estimates for the parameter a that
were very close to 1. Hence, it was considered parsimonious
to instead estimate the model with a constrained to equal 1.
The estimates of the parameters b and d, with their
corresponding t-statistics in parentheses, are given in table 4.3.

Table 4.3

Bilinear Model Estimates

Parameters  88(3)             88(6)             88(9)             88(12)            89(3)

b           -0.081 (-1.210)   -0.078 (-0.968)   -0.014 (-0.175)   0.038 (0.556)     -0.059 (-0.682)
d           7.573 (1.222)     7.273 (0.972)     1.318 (0.177)     -3.609 (-0.566)   5.398 (0.685)
skewness    1.457             0.561             0.960             1.406             1.275
kurtosis    16.991            19.642            18.038            20.149            19.799
W           1.719             1.509             1.146             1.139             1.376

In  every  data  set,  the  parameter  estimates  are 
insignificant  at  the  5%  level  of  significance.  However,  the 
model  did  appear  useful  in  that  a joint  F-test  for  model 
significance  was  only  marginally  insignificant. 

As for the residual diagnostics, W is a statistic formed
by considering a second-order covariance analysis on the
squares of the residuals. It is suggested as a test for
independence by Granger and Andersen (1978b), and is
asymptotically distributed as normal with mean zero and
variance unity. This test, as opposed to other tests for
serial correlation, is used because standard tests tend to
pick up the functional relationship of the residuals specified
by the model. The results indicate that the test cannot
reject the hypothesis that the residuals are independent.
However, under the assumption that the residuals are Gaussian,
the coefficients of skewness and kurtosis appear problematic.

Overall, it is quite clear that the bilinear model is
dominated by the unit root process of the data.^ However, in
the next section the estimated models will also be judged by
looking at their predictive ability.

A Time-Varying Parameter Model

Next, two time-varying parameter models were estimated.
To fit both of these models the following procedure was used.
Given the model specified in equation (4.7), let $b_t$ be the
estimate of $\beta_t$ and $S_t$ an estimate of its variance, i.e.,
$\mathrm{var}(b_t) = S_t$. At time $t$, the observation $P_t$ is known. An
estimate of $\beta_t$, therefore, from the first equation
is $(P_t P_{t-1})/P_{t-1}^2$, or simply $P_t/P_{t-1}$, with variance
$\sigma^2/P_{t-1}^2$. Also, from the second equation

$$\beta_t = \alpha\beta_{t-1} + u_t,$$

an estimate for $\beta_t$ is $a\,b_{t-1}$, where $a$ is the estimate of
$\alpha$, and its variance is $(a^2 S_{t-1} + \sigma_u^2)$. $\sigma_u^2$ is added to
$a^2 S_{t-1}$ because of the error term $u_t$. The time-varying
parameter estimate of $\beta_t$ is found by combining these two
estimates with weights in inverse proportion to their variances,

^ Other specifications were fit and in all cases the
models were dominated by the unit root component of the data.

$$b_t = \left[\frac{1}{a^2 S_{t-1} + \sigma_u^2} + \frac{P_{t-1}^2}{\sigma^2}\right]^{-1}\left[\frac{a\,b_{t-1}}{a^2 S_{t-1} + \sigma_u^2} + \frac{P_t P_{t-1}}{\sigma^2}\right] \qquad (4.16)$$

where the variance $S_t$ is given by

$$S_t = \left[\frac{1}{a^2 S_{t-1} + \sigma_u^2} + \frac{P_{t-1}^2}{\sigma^2}\right]^{-1}.$$

These recursive relations yield estimates that are equivalent
to those given by the Kalman filter. To begin the recursions,
initial estimates are needed. The initial estimates are

[Equation (4.17): the expressions for the initial estimates are
not legible in the source.]

From these initial estimates the first set of iterations is
conducted. Given $b_1$, and after a new series of $b_t$'s is
iterated, new estimates of $\sigma^2$, $\sigma_u^2$, $a$, and $S_1$ are constructed.
With these new estimates, a new series of $b_t$'s is
constructed. This procedure continues until the new
estimates are not different from the old or the series $\{b_t\}$
does not change.
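
One pass of the recursions, together with a check of the stated equivalence with the Kalman filter, can be sketched in Python. The sketch assumes the precision-weighted form described above with $P_{t-1}$ as the single regressor; the function names, start-up values, and variance values are hypothetical, and the outer loop that re-estimates the variances and $a$ is not shown.

```python
def tvp_filter(prices, a, sigma2, sigma2_u, b_init, s_init):
    """Combine the two estimates of beta_t with inverse-variance weights."""
    b, s = b_init, s_init
    path = []
    for t in range(1, len(prices)):
        p_prev, p_now = prices[t - 1], prices[t]
        prior_var = a * a * s + sigma2_u           # variance of a * b_{t-1}
        precision = p_prev * p_prev / sigma2 + 1.0 / prior_var
        s = 1.0 / precision                        # updated variance S_t
        b = s * (p_now * p_prev / sigma2 + a * b / prior_var)
        path.append((b, s))
    return path


def kalman_filter(prices, a, sigma2, sigma2_u, b_init, s_init):
    """The same update written in the usual gain form, for comparison."""
    b, s = b_init, s_init
    path = []
    for t in range(1, len(prices)):
        p_prev, p_now = prices[t - 1], prices[t]
        prior_mean = a * b
        prior_var = a * a * s + sigma2_u
        gain = prior_var * p_prev / (p_prev * p_prev * prior_var + sigma2)
        b = prior_mean + gain * (p_now - p_prev * prior_mean)
        s = (1.0 - gain * p_prev) * prior_var
        path.append((b, s))
    return path


# Illustrative inputs only.
prices = [93.0, 93.1, 92.9, 93.2, 93.15]
weighted = tvp_filter(prices, 1.0, 0.01, 1e-6, 1.0, 1e-4)
kalman = kalman_filter(prices, 1.0, 0.01, 1e-6, 1.0, 1e-4)
print(weighted[-1], kalman[-1])
```

The two functions differ only algebraically, so their outputs should agree up to rounding error.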

The first model fit was the one given above and the
second constrained $a = 1$. In both cases, when the series $\{b_t\}$
converged, the values $b_t$ all hovered around 1 with a variance
at each time period roughly equal to $2.49943 \times 10^{-12}$. Given this
result, two different tests for stability were conducted. In
both tests the null hypothesis is that the parameters of the
model are time invariant. The first test has an alternative
hypothesis of unstable regression coefficients and is
conducted by splitting the entire sample into several
arbitrary nonoverlapping subsamples and calculating the
between-group-over-within-groups ratio of mean squares. Under
the null hypothesis of stable coefficients, the test statistic
is distributed $F(p-1, T-p)$, since there is only one regressor
in the mean equation, where $p$ is the number of
nonoverlapping subsamples. Test results for my data sets
could not reject the null hypothesis of stable coefficients.
The second test had the same null, but was against the
alternative hypothesis of random-walk coefficients. This more
specific test was conducted by considering the heteroskedastic
form that ordinary least squares regression residuals have
under an alternative hypothesis of random-walk coefficients.
Knowing that the form depends on $tP_{t-1}^2$, one can follow Breusch
and Pagan (1979) by using one half times the explained sum of
squares from a regression of $e_t^2/\hat{\sigma}^2$ on $tP_{t-1}^2$, where this
statistic is distributed $\chi^2(1)$. This statistic was also
insignificant at the 5% level of significance in all data sets.
Given the poor fit of the TVP model to futures data, I do not
report the results nor use the model for predictive purposes.
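
The second stability test can be sketched in Python as one half of the explained sum of squares from an auxiliary regression. This is a minimal illustration, not the code behind the reported results: the function names are hypothetical, an intercept is assumed in the auxiliary regression, and $\hat{\sigma}^2$ is taken to be the mean squared residual.

```python
def random_walk_regressor(prices):
    """Build z_t = t * P_{t-1}^2 for t = 1, ..., T-1."""
    return [t * prices[t - 1] ** 2 for t in range(1, len(prices))]


def breusch_pagan_lm(resid, z):
    """One half of the explained sum of squares from regressing
    e_t^2 / sigma^2 on a constant and z_t; compare with chi-square(1)."""
    n = len(resid)
    sigma2 = sum(e * e for e in resid) / n
    g = [e * e / sigma2 for e in resid]
    g_bar = sum(g) / n
    z_bar = sum(z) / n
    szz = sum((v - z_bar) ** 2 for v in z)
    szg = sum((v - z_bar) * (w - g_bar) for v, w in zip(z, g))
    slope = szg / szz
    explained = slope * slope * szz
    return 0.5 * explained


# Small worked example with made-up residuals and regressor values.
stat = breusch_pagan_lm([1.0, 1.0, 2.0, 2.0], [1.0, 2.0, 3.0, 4.0])
print(stat)
```

With OLS residuals from the mean equation and `random_walk_regressor(prices)` as the regressor, a statistic below the $\chi^2(1)$ critical value fails to reject parameter stability.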

A Time-Series  Segmentation  Model 

The procedure used for estimating Sclove's (1983) model
was described in the section above. From a first-order
autoregressive representation to a third-order one, this model
failed to conclude that the data were produced by two
different regimes or states; in all cases, the algorithm
converged to one state of the world. On the surface this
result would not appear as bad if the parameters of the single
autoregressive representation were stable. However, when
estimating the model using several different combinations of
initial estimates for the two autoregressive specifications,
the parameter estimates varied widely: for each combination
of initial estimates, the parameter estimates were different.
For this reason, this model was abandoned and not used for
predictive purposes.

A Stochastic, Segmented Trends Model

The concept behind estimation of the model is
straightforward. Six population parameters determine the
probability law for $\Delta P_t$. These parameters are given by
$\theta = (\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p_{11}, p_{22})'$. The unconditional
distribution of the state of the first observation is
$p(s_1 = 1; \theta) = \rho$, where

$$\rho = \frac{1 - p_{22}}{(1 - p_{11}) + (1 - p_{22})}. \qquad (4.18)$$

The joint probability distribution for the sample of size $T$ and
the unobserved states $s$ is given by

$$p(\Delta P_1, \ldots, \Delta P_T, s_1, \ldots, s_T; \theta) = p(\Delta P_T \mid s_T; \theta)\, p(s_T \mid s_{T-1}; \theta) \cdot p(\Delta P_{T-1} \mid s_{T-1}; \theta)\, p(s_{T-1} \mid s_{T-2}; \theta) \cdots p(\Delta P_1 \mid s_1; \theta)\, p(s_1; \theta). \qquad (4.19)$$

The sample likelihood function is then just the summation of
the joint probability distribution over all possibilities
$(s_1, \ldots, s_T)$, i.e.,

$$\sum_{s_1=1}^{2} \cdots \sum_{s_T=1}^{2} p(\Delta P_1, \ldots, \Delta P_T, s_1, \ldots, s_T; \theta). \qquad (4.20)$$

A simpler way to evaluate the sample likelihood function than
the $2^T$ summations that it would ordinarily require is to use
the algorithm provided by Hamilton (1989). Using this
algorithm and incorporating a Bayesian prior, following
Hamilton (1988), for the parameters of the two states, the
parameter estimates are given by the following equations:

$$\hat{\mu}_j = \frac{\sum_{t=1}^{T} \Delta P_t\, p(s_t = j \mid \Delta P_1, \ldots, \Delta P_T; \theta)}{\nu + \sum_{t=1}^{T} p(s_t = j \mid \Delta P_1, \ldots, \Delta P_T; \theta)} \qquad (4.21)$$

$$\hat{\sigma}_j^2 = \left[\alpha + \tfrac{1}{2}\sum_{t=1}^{T} p(s_t = j \mid \Delta P_1, \ldots, \Delta P_T; \theta)\right]^{-1} \left[\beta + \tfrac{1}{2}\sum_{t=1}^{T} (\Delta P_t - \hat{\mu}_j)^2\, p(s_t = j \mid \Delta P_1, \ldots, \Delta P_T; \theta) + \tfrac{1}{2}\nu\hat{\mu}_j^2\right]$$

$$\hat{p}_{11} = \frac{\sum_{t=2}^{T} p(s_t = 1, s_{t-1} = 1 \mid \Delta P_1, \ldots, \Delta P_T; \theta)}{\sum_{t=2}^{T} p(s_{t-1} = 1 \mid \Delta P_1, \ldots, \Delta P_T; \theta) + \rho - p(s_1 = 1 \mid \Delta P_1, \ldots, \Delta P_T; \theta)}$$

$$\hat{p}_{22} = \frac{\sum_{t=2}^{T} p(s_t = 2, s_{t-1} = 2 \mid \Delta P_1, \ldots, \Delta P_T; \theta)}{\sum_{t=2}^{T} p(s_{t-1} = 2 \mid \Delta P_1, \ldots, \Delta P_T; \theta) + \rho - p(s_1 = 1 \mid \Delta P_1, \ldots, \Delta P_T; \theta)}$$

The Bayesian prior, incorporated by using the parameters $\alpha$, $\beta$,
and $\nu$, is used to avoid singularities of the likelihood
function. Note that the maximum likelihood estimates are just
a special case of the diffuse prior $\alpha = \beta = \nu = 0$, and the use of
priors that are nonzero shifts the maximum likelihood
estimates in the direction of concluding that there is no
difference between the two regimes.
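
The saving from evaluating the likelihood recursively rather than through the $2^T$ summations in (4.20) can be illustrated in Python. The sketch below assumes a two-state model with normally distributed price changes and no autoregressive terms; it is a generic forward filter written for illustration, not a transcription of Hamilton's published algorithm, and all names and input values are hypothetical. The brute-force function is included only to confirm that the two calculations agree on a short series.

```python
import itertools
import math


def normal_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def stationary_prob_state1(p11, p22):
    """Unconditional probability of state 1, as in (4.18)."""
    return (1.0 - p22) / ((1.0 - p11) + (1.0 - p22))


def transition_matrix(p11, p22):
    return [[p11, 1.0 - p11], [1.0 - p22, p22]]


def filter_loglik(y, mu, var, p11, p22):
    """Log-likelihood from a forward recursion over the two states."""
    trans = transition_matrix(p11, p22)
    rho = stationary_prob_state1(p11, p22)
    pred = [rho, 1.0 - rho]
    loglik = 0.0
    for obs in y:
        joint = [pred[j] * normal_pdf(obs, mu[j], var[j]) for j in range(2)]
        lik = joint[0] + joint[1]
        loglik += math.log(lik)
        filt = [joint[0] / lik, joint[1] / lik]
        pred = [filt[0] * trans[0][j] + filt[1] * trans[1][j] for j in range(2)]
    return loglik


def brute_force_likelihood(y, mu, var, p11, p22):
    """Likelihood as the sum of the joint density over all 2^T state paths."""
    trans = transition_matrix(p11, p22)
    rho = stationary_prob_state1(p11, p22)
    init = [rho, 1.0 - rho]
    total = 0.0
    for path in itertools.product(range(2), repeat=len(y)):
        prob = init[path[0]] * normal_pdf(y[0], mu[path[0]], var[path[0]])
        for t in range(1, len(y)):
            prob *= trans[path[t - 1]][path[t]] * normal_pdf(y[t], mu[path[t]], var[path[t]])
        total += prob
    return total


# Illustrative price changes and parameter values.
y = [0.1, -0.05, 0.3, -0.2, 0.02]
mu, var = [0.07, -0.002], [0.18, 0.01]
fast = math.exp(filter_loglik(y, mu, var, 0.85, 0.99))
slow = brute_force_likelihood(y, mu, var, 0.85, 0.99)
print(fast, slow)
```

The recursion needs a fixed amount of work per observation, whereas the enumeration doubles with every additional observation.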

Given this estimation procedure, table 4.4 provides the
results of applying Engel and Hamilton's (1990) model to the
futures price data. I encountered the same problem with
Hamilton's model as that found with Sclove's model when
specifying different autoregressive representations under the
hypothesized two states of the world. Hence, it appears that
it is not the estimation procedure, but the specification,
that does not suit the data. One reason for the failure
of both models may be that they require stationary variables.
Recall that figures 6-10 show that, for these contracts,
changes in futures prices occur in the opposite direction very
frequently. This would lead one to believe that a negative
first-order autoregressive representation would dominate a
positive one. Also, the levels shown in figures 1-5 exhibit

Table 4.4

Stochastic, Segmented Trends Model Estimates

           88(3)              88(6)              88(9)              88(12)             89(3)

μ_1        .07159 (.10876)    .03715 (.18825)    .15040 (.23603)    .12420 (.22863)    .10663 (.23041)
μ_2        -.00181 (.00501)   -.00117 (.00544)   -.00482 (.00579)   -.00414 (.00503)   -.00329 (.00499)
p_11       .84686 (.10349)    .82381 (.16311)    .80862 (.17772)    .81461 (.16656)    .79933 (.18167)
p_22       .99395 (.00437)    .99350 (.00479)    .99645 (.00362)    .99689 (.00318)    .99664 (.00351)
σ_1^2      .18373 (.07435)    .32440 (.17664)    .34941 (.20679)    .34589 (.20153)    .35304 (.20617)
σ_2^2      .00997 (.00075)    .01086 (.00084)    .01139 (.00088)    .01041 (.00074)    .00999 (.00073)
p_sN       .0041              .0017              .0008              .0491              .0007
ρ          .0380              .0356              .0182              .0165              .0165

p_sN is the conditional probability $P(s_N = 1 \mid \Delta P_1, \ldots, \Delta P_N)$, where N = T-50 and T is the number of observations
in a given contract. The standard errors are in parentheses.


sporadic movements even though the series is generally moving
upward or downward or appears to have a 'long swing'. These
sporadic movements are quite different from what Hamilton
encountered when modeling exchange rates. Engel and
Hamilton's (1990) model was able to fit the data somewhat
better. By not specifying an autoregressive representation,
this form of the model was able to distinguish between two
different states. However, even though Hamilton's (1990)
model was able to fit the futures data better than his other
models, ninety-five percent of all observations in all of my
data sets were distinguished as coming from the second state
of the world. Hence, this model still does not seem fully
appropriate for financial futures price data.

Table 4.4 shows that the means of the two distributions,
$\mu_1$ and $\mu_2$, are not significant at conventional sizes in any of
the data sets. This indicates that the trend that the data
follow at any given time period is not well specified. Also,
in every contract, the variance for the first distribution is
not significant. Lastly, the conditional probability that the
last observation's state is 1 and the probability that the
first state is 1 are both very small. This yields further
evidence that the model may not be appropriate for the data.

Prediction  and  Comparison  of  the  Estimated  Models 

For each of the estimated models, prediction is carried
out for a 5-day horizon up through a 50-day horizon. In
total, 10 predictions are made for each model. The mean
square error (MSE) of prediction and the Theil U statistic for
each horizon are first compared to the MSE of prediction and
the U statistic of the simple random walk model. Then, the
MSE of prediction and the U statistic of each of the estimated
models are compared against one another. The criterion for
choosing the best model is simple: the lower the MSE of
prediction and the lower the U statistic, the better the
prediction, and the better the prediction, the better the
model.

The MSE of prediction is formally defined as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}(Pr_t - A_t)^2 \qquad (4.22)$$

where $Pr_t$ is the predicted value at time $t$, $A_t$ is the actual
value at time $t$, and $n$ is the forecast horizon. It is the
simplest measure of forecast accuracy and it is the basis for
all other measures. The Theil U statistic, a function of the
MSE, is given by:

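$$U = \frac{\sqrt{\frac{1}{n}\sum_{t=1}^{n}(\Delta Pr_t - \Delta A_t)^2}}{\sqrt{\frac{1}{n}\sum_{t=1}^{n}(\Delta A_t)^2}} \qquad (4.23)$$

where $\Delta Pr_t$ is the first difference of the predicted values, $\Delta A_t$
is the first difference of the actual values, and $n$ is the
forecast horizon.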

The MSE is an overall measure of forecast performance
that is based purely on the forecast errors. On the other
hand, the Theil U statistic, given in terms of differences, is
used to measure the model's ability to track turning points in
the data. By using both of these measures the best model
should be detected.
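
The two accuracy measures can be computed with a few lines of Python. The sketch is illustrative only: the function names are hypothetical, and the U statistic is written in the usual Theil form in first differences, with the differences taken inside the forecast window, which is an assumption about how the differences are formed.

```python
import math


def mse(predicted, actual):
    """Mean square error of prediction over the forecast horizon."""
    n = len(actual)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n


def theil_u(predicted, actual):
    """Theil U statistic computed from first differences of both series."""
    d_pred = [predicted[t] - predicted[t - 1] for t in range(1, len(predicted))]
    d_act = [actual[t] - actual[t - 1] for t in range(1, len(actual))]
    m = len(d_act)
    numerator = sum((p - a) ** 2 for p, a in zip(d_pred, d_act)) / m
    denominator = sum(a * a for a in d_act) / m
    return math.sqrt(numerator / denominator)


# Made-up forecasts and outcomes over a three-period horizon.
predicted = [1.0, 2.0, 3.0]
actual = [1.0, 3.0, 2.0]
print(mse(predicted, actual), theil_u(predicted, actual))
```

A forecast that reproduces every change in the actual series gives U = 0; larger values indicate that predicted and actual changes diverge.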

Ten tables are given below, one for each forecast
horizon. The first horizon represents the first five days
immediately following the estimation period, the second
represents the first ten days immediately following the
estimation period, the third the first fifteen days, and so
forth. The longest forecast horizon analyzed is fifty trading
days.


Table 4.5

Prediction Under the First Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          2.470         2.698
RAN.WALK_2          0.340         1.107
RAN.WALK_3          3.590         1.065
RAN.WALK_4          0.154         1.462
RAN.WALK_5          2.443         1.397
ARCH-M_1            2.971         2.769
ARCH-M_2            0.335         1.110
ARCH-M_3            4.409         1.066
ARCH-M_4            0.159         1.485
ARCH-M_5            2.443         1.398
GARCH-M_1           2.454         2.704
GARCH-M_2           0.333         1.108
GARCH-M_3           3.556         1.065
GARCH-M_4           0.164         1.462
GARCH-M_5           2.423         1.400
BILINEAR_1          2.442         2.659
BILINEAR_2          0.368         1.095
BILINEAR_3          3.581         1.068
BILINEAR_4          0.168         1.515
BILINEAR_5          2.441         1.406
SEG.TRENDS_1        2.563         2.686
SEG.TRENDS_2        0.360         1.099
SEG.TRENDS_3        3.248         1.055
SEG.TRENDS_4        0.173         1.462
SEG.TRENDS_5        2.328         1.397

Note that the subscripts 1, ..., 5 on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.6


Prediction Under the Second Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          2.570         1.770
RAN.WALK_2          0.193         1.327
RAN.WALK_3          2.179         1.593
RAN.WALK_4          0.182         1.303
RAN.WALK_5          1.767         1.427
ARCH-M_1            2.627         1.693
ARCH-M_2            0.191         1.330
ARCH-M_3            2.622         1.704
ARCH-M_4            0.184         1.312
ARCH-M_5            1.768         1.428
GARCH-M_1           2.581         1.771
GARCH-M_2           0.190         1.327
GARCH-M_3           2.183         1.597
GARCH-M_4           0.189         1.301
GARCH-M_5           1.751         1.428
BILINEAR_1          2.559         1.764
BILINEAR_2          0.203         1.281
BILINEAR_3          2.172         1.589
BILINEAR_4          0.192         1.322
BILINEAR_5          1.763         1.431
SEG.TRENDS_1        2.465         1.767
SEG.TRENDS_2        0.218         1.324
SEG.TRENDS_3        2.337         1.595
SEG.TRENDS_4        0.204         1.296
SEG.TRENDS_5        1.686         1.426

Note that the subscripts 1, ..., 5 on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.7


Prediction Under the Third Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          2.366         1.631
RAN.WALK_2          0.313         1.301
RAN.WALK_3          1.553         1.596
RAN.WALK_4          0.181         1.205
RAN.WALK_5          1.260         1.391
ARCH-M_1            2.586         1.551
ARCH-M_2            0.315         1.304
ARCH-M_3            1.998         1.703
ARCH-M_4            0.182         1.211
ARCH-M_5            1.260         1.393
GARCH-M_1           2.362         1.631
GARCH-M_2           0.314         1.300
GARCH-M_3           1.552         1.600
GARCH-M_4           0.183         1.204
GARCH-M_5           1.244         1.392
BILINEAR_1          2.360         1.628
BILINEAR_2          0.323         1.265
BILINEAR_3          1.546         1.592
BILINEAR_4          0.188         1.223
BILINEAR_5          1.253         1.395
SEG.TRENDS_1        2.559         1.629
SEG.TRENDS_2        0.285         1.298
SEG.TRENDS_3        1.797         1.597
SEG.TRENDS_4        0.195         1.199
SEG.TRENDS_5        1.257         1.391

Note that the subscripts on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.8


Prediction Under the Fourth Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.856         1.673
RAN.WALK_2          0.349         1.470
RAN.WALK_3          1.443         1.486
RAN.WALK_4          0.256         1.105
RAN.WALK_5          1.072         1.345
ARCH-M_1            2.079         1.657
ARCH-M_2            0.349         1.475
ARCH-M_3            2.005         1.579
ARCH-M_4            0.258         1.109
ARCH-M_5            1.072         1.346
GARCH-M_1           1.851         1.673
GARCH-M_2           0.348         1.469
GARCH-M_3           1.436         1.489
GARCH-M_4           0.259         1.104
GARCH-M_5           1.062         1.346
BILINEAR_1          1.851         1.668
BILINEAR_2          0.346         1.409
BILINEAR_3          1.436         1.482
BILINEAR_4          0.269         1.119
BILINEAR_5          1.066         1.348
SEG.TRENDS_1        2.103         1.670
SEG.TRENDS_2        0.405         1.471
SEG.TRENDS_3        1.563         1.484
SEG.TRENDS_4        0.381         1.108
SEG.TRENDS_5        1.209         1.343

Note that the subscripts on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.9


Prediction Under the Fifth Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.654         1.638
RAN.WALK_2          0.389         1.508
RAN.WALK_3          1.171         1.558
RAN.WALK_4          0.484         1.339
RAN.WALK_5          0.956         1.345
ARCH-M_1            1.802         1.598
ARCH-M_2            0.390         1.510
ARCH-M_3            1.688         1.661
ARCH-M_4            0.485         1.339
ARCH-M_5            0.955         1.346
GARCH-M_1           1.651         1.638
GARCH-M_2           0.391         1.508
GARCH-M_3           1.164         1.563
GARCH-M_4           0.491         1.335
GARCH-M_5           0.941         1.346
BILINEAR_1          1.649         1.632
BILINEAR_2          0.385         1.439
BILINEAR_3          1.164         1.554
BILINEAR_4          0.495         1.325
BILINEAR_5          0.947         1.350
SEG.TRENDS_1        1.898         1.636
SEG.TRENDS_2        0.392         1.508
SEG.TRENDS_3        1.583         1.560
SEG.TRENDS_4        0.843         1.340
SEG.TRENDS_5        1.094         1.345

Note that the subscripts on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.10


Prediction Under the Sixth Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.555         1.666
RAN.WALK_2          0.388         1.484
RAN.WALK_3          1.057         1.560
RAN.WALK_4          0.423         1.348
RAN.WALK_5          0.811         1.342
ARCH-M_1            1.711         1.639
ARCH-M_2            0.388         1.485
ARCH-M_3            1.530         1.656
ARCH-M_4            0.424         1.348
ARCH-M_5            0.810         1.343
GARCH-M_1           1.553         1.666
GARCH-M_2           0.389         1.484
GARCH-M_3           1.053         1.564
GARCH-M_4           0.427         1.344
GARCH-M_5           0.798         1.343
BILINEAR_1          1.548         1.658
BILINEAR_2          0.384         1.423
BILINEAR_3          1.052         1.557
BILINEAR_4          0.431         1.335
BILINEAR_5          0.803         1.346
SEG.TRENDS_1        1.844         1.664
SEG.TRENDS_2        0.453         1.485
SEG.TRENDS_3        1.847         1.563
SEG.TRENDS_4        0.867         1.342
SEG.TRENDS_5        1.088         1.397

Note that the subscripts 1, ..., 5 on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.11


Prediction Under the Seventh Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.410         1.665
RAN.WALK_2          0.469         1.481
RAN.WALK_3          4.191         1.763
RAN.WALK_4          0.399         1.320
RAN.WALK_5          0.743         1.341
ARCH-M_1            1.533         1.625
ARCH-M_2            0.469         1.480
ARCH-M_3            5.433         1.883
ARCH-M_4            0.399         1.320
ARCH-M_5            0.743         1.342
GARCH-M_1           1.407         1.665
GARCH-M_2           0.471         1.481
GARCH-M_3           4.211         1.757
GARCH-M_4           0.399         1.317
GARCH-M_5           0.735         1.342
BILINEAR_1          1.402         1.656
BILINEAR_2          0.464         1.422
BILINEAR_3          4.205         1.755
BILINEAR_4          0.404         1.307
BILINEAR_5          0.737         1.346
SEG.TRENDS_1        1.779         1.663
SEG.TRENDS_2        0.501         1.481
SEG.TRENDS_3        5.215         1.763
SEG.TRENDS_4        0.878         1.319
SEG.TRENDS_5        1.303         1.342

Note that the subscripts 1, ..., 5 on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.12

Prediction Under the Eighth Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.279         1.653
RAN.WALK_2          0.569         1.426
RAN.WALK_3          3.757         1.752
RAN.WALK_4          0.390         1.310
RAN.WALK_5          0.711         1.345
ARCH-M_1            1.429         1.622
ARCH-M_2            0.569         1.426
ARCH-M_3            4.897         1.871
ARCH-M_4            0.390         1.309
ARCH-M_5            0.711         1.346
GARCH-M_1           1.276         1.653
GARCH-M_2           0.571         1.426
GARCH-M_3           3.776         1.757
GARCH-M_4           0.394         1.307
GARCH-M_5           0.703         1.346
BILINEAR_1          1.274         1.644
BILINEAR_2          0.561         1.379
BILINEAR_3          3.770         1.755
BILINEAR_4          0.399         1.298
BILINEAR_5          0.706         1.350
SEG.TRENDS_1        1.771         1.651
SEG.TRENDS_2        0.641         1.426
SEG.TRENDS_3        5.145         1.752
SEG.TRENDS_4        1.231         1.311
SEG.TRENDS_5        1.404         1.344

Note that the subscripts 1, ..., 5 on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.13


Prediction Under the Ninth Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.247         1.625
RAN.WALK_2          0.512         1.470
RAN.WALK_3          3.352         1.755
RAN.WALK_4          0.439         1.256
RAN.WALK_5          0.665         1.333
ARCH-M_1            1.418         1.600
ARCH-M_2            0.513         1.469
ARCH-M_3            4.419         1.874
ARCH-M_4            0.439         1.255
ARCH-M_5            0.665         1.334
GARCH-M_1           1.242         1.625
GARCH-M_2           0.571         1.760
GARCH-M_3           3.366         1.065
GARCH-M_4           0.438         1.254
GARCH-M_5           0.654         1.334
BILINEAR_1          1.243         1.614
BILINEAR_2          0.507         1.415
BILINEAR_3          3.362         1.758
BILINEAR_4          0.447         1.242
BILINEAR_5          0.658         1.338
SEG.TRENDS_1        1.933         1.624
SEG.TRENDS_2        0.600         1.470
SEG.TRENDS_3        5.091         1.756
SEG.TRENDS_4        1.280         1.256
SEG.TRENDS_5        1.454         1.333

Note that the subscripts 1, ..., 5 on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.

Table 4.14


Prediction Under the Tenth Forecast Horizon

                  MSE X 10®    U-STATISTIC

RAN.WALK_1          1.164         1.626
RAN.WALK_2          0.470         1.470
RAN.WALK_3          3.018         1.755
RAN.WALK_4          0.448         1.252
RAN.WALK_5          0.638         1.326
ARCH-M_1            1.346         1.600
ARCH-M_2            0.471         1.469
ARCH-M_3            4.011         1.874
ARCH-M_4            0.447         1.251
ARCH-M_5            0.637         1.327
GARCH-M_1           1.159         1.626
GARCH-M_2           0.472         1.469
GARCH-M_3           3.031         1.759
GARCH-M_4           0.444         1.251
GARCH-M_5           0.625         1.327
BILINEAR_1          1.160         1.612
BILINEAR_2          0.466         1.414
BILINEAR_3          3.027         1.758
BILINEAR_4          0.456         1.239
BILINEAR_5          0.630         1.332
SEG.TRENDS_1        1.957         1.624
SEG.TRENDS_2        0.580         1.470
SEG.TRENDS_3        5.271         1.755
SEG.TRENDS_4        1.407         1.254
SEG.TRENDS_5        1.544         1.326

Note that the subscripts on each of the models indicate the model for contracts 88(3),
88(6), ..., 89(3), respectively.


When compared to the random walk, the ARCH-M model
predicts worse at short horizons. The MSE of prediction of
the random walk model is at least as small as that of the
ARCH-M model in the first five horizons for all contracts. The
random walk also predicts better for horizons six through ten
in three of the five data sets. The U-statistics also make
clear that the random walk is better able to predict turning
points in the data. This is somewhat remarkable because the
random walk by definition cannot predict turning points. Only
in contract 88(3) does the ARCH-M model predict turning points
more accurately. Given these results, the ARCH-M model does
not seem appropriate for capturing the important
nonlinearities of the data. Thus, even though there are
theoretical reasons to believe that conditional moments are
important determinants of asset prices®, these results call
into question the extent to which the ARCH model has been
used in modeling financial data, especially futures data.

The GARCH-M model performed somewhat better than the
ARCH-M model, but still not well enough to suggest that the
nonlinearities in the data were completely modeled. In
two of the five data sets, the GARCH-M model had a smaller MSE
of prediction than the random walk. When considering all five
contracts, in twenty-nine of the possible fifty horizons the
GARCH-M model had a smaller MSE of prediction. In addition,
the prediction horizon did not appear to be important.
Depending upon the particular contract, either the random walk
or the GARCH-M model dominated in every horizon. However, the
ability of the GARCH-M model to track turning points was much
worse than that of the random walk. In three of the five data
sets the random walk had smaller U-statistics. These results
should not appear surprising since the ARCH-M model also could
not outperform the simple linear model.

® Many intertemporal asset-pricing models give rise to
Euler equations that involve conditional expectations of
marginal utilities across time periods (see Lucas (1978)).

The bilinear model appeared to be the best of all the
nonlinear models considered. The random walk, when
compared to the bilinear model, had a lower MSE of prediction
in only one of the five contracts. The bilinear model was
also better able to detect turning points. In thirty-one of
the fifty possible comparisons, the bilinear model had lower
U-statistics than the random walk. As in the comparison
between the GARCH-M model and the random walk, the prediction
horizon did not appear to be important here either. The
bilinear model dominated the random walk's performance for
every horizon and in every contract except 88(12).

When compared to the random walk, the worst model was the
stochastic, segmented trends (SST) model. The random walk
performed better in every contract. The SST model had lower
MSEs of prediction in only six of the fifty possible cases.
Given the poor results in the estimation period, these
prediction results should not be surprising. Regardless of
which futures contract was modeled, roughly ninety percent of
the time the SST model estimated the data to come from state
two.

The ability of the SST model to predict turning points
was somewhat better than its overall performance. For
contract 88(3), it was able to predict turning points better
than the random walk in every horizon. Overall, the SST model
had lower U-statistics in the earlier horizons.
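
The model-by-model tallies against the random walk quoted in this section can be recomputed mechanically from the forecast tables. The following is a minimal sketch in Python, using only the tenth-horizon MSE values printed in Table 4.14; the names `MSE_H10` and `wins_against` are illustrative and are not part of the study. Because it covers a single horizon, its counts need not match the fifty-comparison totals reported in the text.

```python
# MSE of prediction at the tenth horizon, copied from Table 4.14.
# Each list holds the values for subscripts 1 through 5 (one per contract).
MSE_H10 = {
    "RAN.WALK":   [1.164, 0.470, 3.018, 0.448, 0.638],
    "ARCH-M":     [1.346, 0.471, 4.011, 0.447, 0.637],
    "GARCH-M":    [1.159, 0.472, 3.031, 0.444, 0.625],
    "BILINEAR":   [1.160, 0.466, 3.027, 0.456, 0.630],
    "SEG.TRENDS": [1.957, 0.580, 5.271, 1.407, 1.544],
}


def wins_against(benchmark, table):
    """Count, per model, the contracts where its value is strictly below the benchmark's."""
    base = table[benchmark]
    return {
        model: sum(v < b for v, b in zip(values, base))
        for model, values in table.items()
        if model != benchmark
    }


wins = wins_against("RAN.WALK", MSE_H10)
for model, count in wins.items():
    print(f"{model}: lower MSE than the random walk in {count} of 5 contracts")
```

Repeating the same count over the tables for all ten horizons, and over the U-statistic column instead of the MSE column, would give the fifty-comparison totals for each criterion.
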
Among just the nonlinear models, the SST model still has
the largest MSE of prediction. However, if the performance of
the models is measured only by the MSE in the
first horizon, then the SST model performs better than both
the ARCH-M and bilinear models and better than the GARCH-M
model in two of the five data sets.

The ARCH-M model performs worse than both the bilinear
and GARCH-M models, although, for contract 88(12), the ARCH-M
model has the lowest MSE of prediction of all the models.
Even for this contract, however, its MSE of prediction is not
much lower.

The GARCH-M and bilinear models have nearly equal MSEs of
prediction. However, there are still some marked differences
between their performances. The GARCH-M model outperforms the
bilinear model in earlier horizons, and the bilinear model is
clearly the better model for horizons 6 through 10.

Considering just the ability to predict turning points,
the bilinear model is undoubtedly the best model. However,
the SST model had the lowest U-statistics when considering
just the first horizon. GARCH-M is better than the ARCH-M
model given this criterion, and the ARCH-M model is definitely
the worst of all models, including the SST model, for all
horizons.

Given these two prediction criteria, the bilinear model
is able to model futures prices better than every other
nonlinear model considered. Against the random walk, it also
performed the best. It seems straightforward to conclude then
that bilinear models are the best for T-bill futures prices,
and that the popular family of ARCH models may be somewhat
abused when used to model financial data.


CHAPTER  5 

SUMMARY  AND  CONCLUSION 


Several important issues relating to the time series
properties of futures prices have been studied in the previous
chapters. These include the stationarity of 90-day U.S.
Treasury Bill futures prices, the validity of the random walk
hypothesis, the existence of nonlinear dependence, and the
exploitability of the nonlinear dependence that is found in
futures prices in terms of prediction. Results in chapter 2
clearly confirm that Treasury Bill futures prices are
nonstationary. In addition, I find that the best random walk
model is one with neither a drift nor a trend term. These
results confirm the results of other studies of financial
futures prices and provide a necessary condition for the
random walk hypothesis to be affirmed.
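
To make concrete what a random walk with neither a drift nor a trend term implies for prediction: the forecast of the price at any horizon is simply the last observed price. The sketch below, in Python, computes the MSE of such forecasts at a given horizon. The function name `random_walk_mse`, the price values, and the evaluation scheme (every available forecast origin in the series) are assumptions made for illustration and are not the data or the procedure of this study.

```python
def random_walk_mse(prices, horizon):
    """MSE of h-step-ahead forecasts from a driftless random walk.

    With no drift or trend term, the forecast of prices[t + horizon]
    made at time t is simply prices[t].
    """
    errors = [prices[t + horizon] - prices[t] for t in range(len(prices) - horizon)]
    if not errors:
        raise ValueError("horizon is too long for the series")
    return sum(e * e for e in errors) / len(errors)


# Hypothetical prices for illustration only; not data from this study.
prices = [93.0, 93.5, 93.25, 93.75, 93.5]
mse_h1 = random_walk_mse(prices, 1)
mse_h2 = random_walk_mse(prices, 2)
print(mse_h1, mse_h2)
```
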

Both nonparametric and parametric tests of dependence
show that first differences of futures prices contain no
significant linear dependence. On the other hand, the Brock,
Dechert, and Scheinkman test, the ARCH specification test,
and Tsay's test for nonlinearity all clearly demonstrated
that nonlinear dependence is present. Thus, the random walk
hypothesis cannot be verified. Hsieh's test to distinguish
between additive and multiplicative nonlinear dependence
indicates that the data primarily contain multiplicative
dependence. However, using two predictive criteria in
chapter 4, I find that the bilinear model is the best
nonlinear model for the data. Given this result and the fact
that a bilinear process exhibits additive nonlinear
dependence, it is not clear that Hsieh's test is powerful
enough to detect the different types of nonlinear dependence
found in the data.

As mentioned above, the forecasting performance of the
bilinear model is better than that of the random walk model
and the other nonlinear models considered. Time-varying
parameter models and several versions of Sclove's (1983) time
series segmentation model were found to be inappropriate for
these futures prices. The GARCH-M model was the second best,
the ARCH-M model third best, and Hamilton's (1989) stochastic,
segmented trends model worst of all the nonlinear models
estimated. However, in earlier horizons, Hamilton's model was
able to predict turning points better than most models. The
most important results of chapter 4 are that nonlinear
dependence can be exploited for predictive purposes; that the
bilinear model is the best model for these futures data; that
some nonlinear models predict better than the random walk; and
that the ARCH family of models is probably being misused or
abused.